INTRODUCTION

Normal  breast  epithelial  cells  undergo  both  genetic  and  epigenetic  changes  in  their  malignant 
progression  to  cancer  (1).  In  that  progression,  changes  in  proto-oncogenes  and  tumor  suppressor 
genes  play  an  important  role.  Because  the  (sub-)  chromosomal  loss  (or  decrease)  of  tumor 
suppressor  genes  and  the  gain  (or  amplification)  of  proto-oncogenes  may  confer  cells  with 
growth  advantages,  these  changes  provide  an  important  tool  to  identify  genes  that  are  important 
in  carcinogenesis.  Several  consistent  changes  have  been  reported,  from  karyotyping  analyses  of 
breast  tumors  and  cancer  cell  lines  (2,  3).  By  scanning  the  changes  in  gene  expression  and  DNA 
methylation,  we  proposed  to  identify  tumor  suppressor  genes  in  the  chromosomal  regions  of 
1p33-pter, 8p, and 18q, three of the regions that are most consistently decreased in copy number.


BODY 

The  following  four  tasks  and  timelines  were  proposed: 

Task 1. To construct oligodeoxynucleotide microarrays representing NotI sites in the
chromosomal regions of 1p33-pter, 8p, and 18q (months 1-12):

• Identify all the NotI sites in chromosomal regions 1p33-pter, 8p, and 18q, then
determine the size of the NotI-MseI DNA fragments;

• Select, out of roughly 800 NotI sites in those regions, more than 500 gene
promoter-associated NotI sites that are located within 2 kbp of the nearest MseI sites;

• Design ~50-60mer oligodeoxynucleotides representing those NotI-MseI DNA fragments;
and

• Prepare oligodeoxynucleotide microarrays.

Task  2.  To  identify  tumor  suppressor  gene  candidates  by  DNA  microarray  MS-AFLP  (months  9- 
30): 

•  Perform  the  DNA  microarray  MS-AFLP  hybridization  experiments  using  genomic  DNA 
from  normal  breast  epithelial  cells  and  three  breast  cancer  cell  lines  (MCF7,  BT-20,  and  MDA- 
MB468);  and 

• Identify the NotI sites that exhibit a decrease in spot intensity in those three breast
cancer cell lines, followed by the examination of neighboring genes.

Task  3.  To  examine  gene  expression  by  SM  RT-PCR  (months  19-36): 

•  Establish  the  SM  RT-PCR  system  composed  of  ~10  candidate  genes  located  in  the  3 
chromosomal  regions  indicated  in  Task  1; 

•  Perform  SM  RT-PCR  experiments  using  RNA  from  normal  breast  epithelial  cells  and  three 
breast  carcinoma  cell  lines  (MCF7,  BT-20,  and  MDA-MB468);  and 

•  Identify  genes  that  are  down-regulated  in  expression  in  those  breast  cancer  cell  lines. 

Task  4.  To  examine  homozygous  deletion  by  SM  PCR  (months  19-36): 

•  Perform  the  SM  PCR  experiments  using  genomic  DNA  from  normal  breast  epithelial  cells 
and  the  same  three  breast  carcinoma  cell  line  cells;  and 

•  Identify  genes  that  are  homozygously  deleted  in  those  breast  cancer  cell  lines. 

In Year 1, we planned to perform Task 1 and part of Task 2. We started with Task 1 and
identified all the NotI sites in the chromosomal regions of 1p33-pter, 8p, and 18q for the
NotI-MseI MS-AFLP analysis (4, 5). We soon obtained preliminary results indicating that the
proposed method for detecting DNA methylation alterations would cover fewer genes than we
expected. This was because NotI sites tended to cluster rather than spread evenly over the
genome. For example, the chromosomal region of 18q contains 411 genes and 174 NotI sites.
If one NotI site were present per gene, more than 40% of the genes could be examined for
changes in DNA methylation. However, fewer than 80 genes were found to actually possess
NotI sites in their promoter regions, which amounts to only about 20% of the genes in that
region. Assuming that the sensitivity of the NotI-MseI DNA microarray MS-AFLP method is 75%
using oligonucleotide probes, as we previously determined, we could analyze only about 60
of the 411 genes. Additionally, the methylation status of the promoter region of one gene
may not necessarily characterize the methylation statuses of its neighboring genes or
correlate inversely with the transcription of the gene. Therefore, we proceeded directly to
Tasks 3 and 4 to identify the genes that exhibit differences in expression and copy number
between the primary culture of normal mammary epithelial cells and established breast
cancer cell lines.
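
The coverage estimate for 18q can be reproduced with a few lines of arithmetic. The following is a minimal sketch in Python; the function and key names are illustrative and were not part of the original analysis, and 80 is used as an upper bound for the "fewer than 80" genes with promoter-associated NotI sites.

```python
def coverage_summary(total_genes, noti_sites, promoter_genes, sensitivity):
    """Return best-case, observed, and assay-limited gene coverage.

    All names are hypothetical; the numbers passed in below are the
    figures quoted in the report for chromosomal region 18q.
    """
    # Best case: every NotI site falls in a different gene.
    best_case_fraction = min(noti_sites, total_genes) / total_genes
    # Observed: genes that actually carry a NotI site in their promoter.
    observed_fraction = promoter_genes / total_genes
    # Assay-limited: observed genes scaled by the method's sensitivity.
    detectable_genes = promoter_genes * sensitivity
    return {
        "best_case_fraction": best_case_fraction,
        "observed_fraction": observed_fraction,
        "detectable_genes": detectable_genes,
        "detectable_fraction": detectable_genes / total_genes,
    }


summary = coverage_summary(
    total_genes=411, noti_sites=174, promoter_genes=80, sensitivity=0.75
)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

Under these assumptions the best-case coverage is about 42%, the observed coverage is just under 20%, and the assay-limited count is 60 genes, or roughly 15% of the region.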

We  examined  the  expression  of  127  genes  in  the  chromosomal  region  of  18q21-qter  in 
normal and cancerous breast cells and tissues. Rather than analyze the entire 18q arm, we
restricted the analysis to the chromosomal region of 18q21-qter. Focusing our efforts on the
genes located in cytobands where potential tumor suppressor genes are likely to be located (6)
reduced the total number of protein-coding genes for analysis from 223 to 140. We used two
different  techniques  to  examine  gene  expression:  systematic  multiplex  RT-PCR  (SM  RT-PCR) 
and  DNA  microarray  hybridization.  We  used  the  Illumina  BeadChips  for  DNA  microarray 
hybridization.  We  identified  several  interesting  genes  that  exhibited  differences  in  gene 
expression. Partial or entire loss of expression was observed in genes such as CCBE1, CCDC11,
CD226, NP_115536.1, NP_689683.2, RNF152, SERPINB8, and TCF4 in a majority of breast
cancer  cell  lines  that  were  examined.  An  increase  in  gene  expression  was  rare,  but  found  with  the 
transcription  factor  ONECUT2  gene  in  all  the  cancer  cell  lines.  We  further  examined  the 
expression  of  the  selected  genes  from  18q21-qter  by  real-time  qRT-PCR.  We  did  this  not  only 
with  the  cDNA  specimens  from  breast  cancer  cell  lines  that  were  previously  used  in  the  SM  RT- 
PCR  studies,  but  also  with  the  cDNA  specimens  prepared  from  the  matched  pairs  of  normal  and 
cancerous  breast  tissues  from  breast  cancer  patients.  Real-time  qRT-PCR  experiments  confirmed 
that  the  SM  RT-PCR  results  obtained  with  the  breast  cancer  cell  lines  were  correct.  Analysis  of 
clinical  specimens  of  breast  cancer  demonstrated  that  the  gene  expression  of  CCBE1,  TCF4, 
NP_115536.1, and NP_689683.2 was down-regulated in the majority of clinical cases of breast
cancer. We also performed copy number analysis by both SM PCR and arrayCGH. We found
homozygous  deletions  of  the  SMAD4  and  ELAC1  genes  in  the  MDA-MB468  breast  cancer  cell 
line. 


In  Year  2,  we  examined  the  expression  of  273  genes  located  on  the  p-arm  of  chromosome  8 
in  breast  cancer  cell  lines  by  SM  RT-PCR  and  DNA  microarray  hybridization  using  the  Illumina 
BeadChips.  We  observed  frequent  decreases  in  expression  of  approximately  two-dozen  genes 
and  increases  in  expression  of  several  genes  on  this  chromosomal  arm.  These  changes  in  gene 
expression  of  the  cell  lines  were  later  confirmed  by  real-time  qRT-PCR.  Additionally  and  more 
importantly,  we  found  that  a  number  of  these  variations  were  also  observed  in  the  majority  of 
clinical  cases  of  breast  cancer  that  we  examined.  These  included  down-regulation  of  the 
MYOM2, NP_859074, NP_001034551, NRG1, PHYIP (PHYHIP), Q7Z2R7, SFRP1, and SOX7
genes and up-regulation of the ESCO2, NP_115712 (GINS4), Q6P464, and TOPK (PBK) genes.
We  did  not  observe  any  genes  that  were  homozygously  deleted  in  the  breast  cancer  cell  lines 
examined  by  SM  PCR  and  arrayCGH. 

In Year 3, we examined the expression of 624 genes in the chromosomal region of 1p33-pter
by DNA microarray hybridization using Illumina BeadChips. We also analyzed the
expression of some of the genes by SM RT-PCR. In contrast to the chromosomal regions of
18q21-qter and 8p, we did not find particularly interesting candidate genes that exhibited
down-regulation in expression in the 1p33-pter region. Although there were genes that exhibited
decreased  expression  in  several  cancer  cell  lines  in  comparison  with  primary  cultures  of  normal 
epithelial  cells,  the  frequency  was  low  for  most  of  the  genes.  The  PLCH2  gene  was  down- 
regulated  in  a  majority  of  breast  cancer  cell  lines  that  were  analyzed.  However,  the  expression  in 
breast  tumor  tissues  increased.  A  few  other  genes,  such  as  HES5  and  AJAP1,  seem  to  be  down- 
regulated  in  a  majority  of  cell  lines  and  await  further  examination  by  real-time  qRT-PCR  for 
confirmation. 

Rather  than  determining  the  methylation  statuses  of  the  promoter  regions  of  the  candidate 
genes  with  potential  tumor  suppressor  activity,  we  proceeded  to  construct  the  cDNA  expression 
constructs  of  many  of  these  candidate  genes  in  a  eukaryotic  expression  vector,  pcDNA3.1/V5- 
His.  This  decision  was  based  on  our  belief  that  gene  expression  is  more  important  than 
methylation  of  the  promoter  for  gene  functionality. 


KEY  RESEARCH  ACCOMPLISHMENTS 

We  have  just  finished  the  third  year  of  the  3-year  project.  The  key  research  accomplishment  for 
the  entire  period  of  research  is  the  determination  of  changes  in  expression  of  the  genes  in  the 
chromosomal regions of 18q21-qter, 8p, and 1p33-pter. We did the analysis not only by the DNA
microarray hybridization method, but also by the SM RT-PCR method. The two methods
complemented each other. The DNA microarray hybridization allows quantitative
measurement  for  moderately  to  highly  expressed  genes  but  often  fails  to  detect  weakly  expressed 
genes.  The  SM  RT-PCR  method  allows  semi-quantitative  detection  of  weakly  expressed  genes. 
Our  study  has  identified  more  than  a  dozen  genes  that  are  down-regulated  in  gene  expression  in 
breast  cancer  cell  lines  and  also  in  clinical  specimens  of  breast  cancer  in  the  chromosomal 
regions.  We  have  constructed  eukaryotic  expression  constructs  of  these  genes,  and  they  are 
waiting  to  be  tested  for  functionality  in  tumor  suppression  activity. 


REPORTABLE  OUTCOMES 

We  have  already  published  the  results  obtained  from  the  copy  number  and  expression  analysis  of 
the  genes  in  the  chromosomal  region  of  18q21-qter  (7).  The  PDF  of  the  paper  is  included  in  the 
Appendices.  We  have  submitted  a  manuscript  describing  the  results  from  the  copy  number  and 
expression  analysis  of  the  genes  in  the  chromosomal  arm  of  8p  and  are  waiting  for  the  outcome 
of  the  peer  review.  The  PDF  of  the  manuscript  is  also  included  in  the  Appendices.  We  have  not 
yet finished the SM RT-PCR analysis of many of the genes in the chromosomal region of
1p33-pter. Although the DNA microarray hybridization experiments of all the genes in the region did
not  identify  any  particularly  intriguing  candidate  genes,  SM  RT-PCR  may  identify  several  genes 
that  are  down-regulated  in  breast  cancer  among  the  genes  that  are  weakly  expressed.  Once  the 
work  is  completed,  we  will  publish  the  results.  Additional  funding  will  be  necessary  to  perform 
DNA  transfection  experiments  of  the  eukaryotic  expression  constructs  of  those  candidate  genes 
that  have  been  identified  in  this  study,  and  we  are  trying  to  obtain  the  funding  to  secure  the 
continuation  of  this  important  research. 


CONCLUSIONS 

We have learned several important lessons from the completed research. First, the methylation
status of a gene promoter does not always coincide with gene transcription activity. Genes with
hypomethylated promoters are occasionally unexpressed. On the other hand, genes with
hypermethylated promoters tend to be unexpressed, as anticipated. Second, differences in gene
expression  between  the  primary  cultures  of  normal  breast  epithelial  cells  and  established  breast 
cancer  cell  lines  are  often  reproduced  in  the  clinical  specimens  of  normal  and  cancerous  pairs  of 
breast  tissues.  However,  there  are  cases  where  the  differences  were  reversed  in  the  clinical 
specimens.  The  most  prominent  case  was  found  with  the  PLCH2  gene,  where  the  expression  is 
down-regulated  in  all  the  breast  cancer  cell  lines  examined,  but  up-regulated  in  a  majority  of 
tumors.  One  possible  explanation  is  that  the  changes  in  the  cells  other  than  epithelial  cells  in 
cancerous  tissues  may  be  responsible.  Because  this  phenomenon  may  implicate  an  important 
interaction  between  breast  cancer  cells  and  normal  cells  surrounding  them  in  tissue,  we  would 
like  to  determine  the  cause  of  this  discrepancy  and  report  it  in  future  results. 

The  expression  analysis  of  the  genes  in  the  chromosomal  regions  that  are  frequently  deleted 
in  breast  cancer  resulted  in  the  identification  of  candidate  genes  with  potential  tumor  suppressor 
activity.  Both  somewhat  characterized  genes  and  previously  uncharacterized  genes  are  found 
among  them.  Because  the  decrease  or  loss  of  gene  expression,  even  if  it  is  combined  with  the 
decreased  copy  number,  is  not  sufficient  to  demonstrate  the  functionality  in  tumor  suppression, 
we  will  need  to  proceed  to  examine  the  activity  by  DNA  transfection  of  the  expression  constructs 
of those candidate genes.

Research Article

Scanning  copy  number  and  gene  expression 
on  the  18q21-qter  chromosomal  region  by 
the  systematic  multiplex  PCR  and  reverse 
transcription-PCR  methods 


We examined differences in copy number and expression of 127 genes located on the
18q21-qter chromosomal region of the breast and prostate cancer cell lines, using the
systematic multiplex PCR and reverse transcription-PCR (SM PCR and SM RT-PCR) methods
that we developed. Semi-quantitative data were obtained that were comparable in quality,
but not in quantity, to data from DNA microarray hybridization analysis. In the
chromosomal region where losses are frequent in breast, prostate, and other cancers, we
detected a homozygous deletion of the SMAD4 gene in the MDA-MB-468 breast cancer cell line.
We also observed partial or entire loss of expression in genes such as CCBE1, CCDC11, CD226,
NP_115536.1, NP_689683.2, RNF152, SERPINB8, and TCF4 in certain breast and/or
prostate cancer cell lines. An increase in gene expression was rare, but found with the
transcription factor ONECUT2 gene in all of the cancer cell lines examined. Real-time
qRT-PCR experiments confirmed these SM RT-PCR results. Further analysis of clinical
specimens of breast cancer by real-time qRT-PCR demonstrated that the gene expression of
CCBE1, TCF4, NP_115536.1, and NP_689683.2 was downregulated in the majority of
clinical cases of breast cancer.


1  Introduction

During cancer progression, normal cells undergo many
complex changes, both genetic and epigenetic, at either the
nucleotide or (sub-)chromosomal level. Oncogenes and
tumor suppressor genes play important roles in promoting
and inhibiting carcinogenesis, respectively [1]. Proto-oncogenes
are activated by gene amplification, up-regulation of
transcription, and activating mutations, whereas tumor
suppressor genes are inactivated by loss of the genes,
transcriptional silencing, and inactivating mutations. Therefore, the
examination of copy number and expression may help to
identify genes involved in carcinogenesis.

Hoping to eventually identify potential tumor suppressor
genes of both breast and prostate cancers, we targeted
the q21-qter region of chromosome 18 among the
chromosomal regions that have been reported to be
frequently lost in cancer. In that region, a few genes (SMAD2,
SMAD4, and BCL2) have been linked to carcinogenesis.
SMAD4, homolog 4 of the Drosophila 'mothers against
decapentaplegic' (Mad) gene, is a cancer predisposition gene
with tumor suppressor activity. Germline mutations of
the gene cause familial juvenile polyposis, an
autosomal dominant disease characterized by a predisposition
to hamartomatous polyps and gastrointestinal cancer
[2, 3]. In addition to the germline changes, homozygous
deletion of the SMAD4 gene was prevalent in pancreatic
carcinomas, and somatic mutations were identified in some
of the carcinomas that lacked deletions [4]. Although
SMAD4 inactivation was also found in breast, ovarian,
and other cancers, it was distinctly uncommon (less than
10%) in those tumor types [5]. SMAD2 and BCL2 are not
cancer predisposition genes; however, somatic changes have
been reported. BCL2 (B-cell leukemia 2) is a proto-oncogene
that was cloned from the junction of the t(14;18) translocation
characteristic of follicular lymphoma [6]. The BCL2 protein
is localized in mitochondria and, when overexpressed,
interferes with programmed cell death independently of
promoting cell division [7]. SMAD2, another homolog of the
Drosophila Mad gene, may play a role as a tumor
suppressor gene in a small fraction (less than 10%) of colorectal
cancers. Therefore, we cautiously assume that additional
cancer genes may be present in the region.

The 18q21-qter region contains relatively few genes,
approximately 140, so we used the systematic multiplex (SM)
PCR and SM reverse transcription (RT)-PCR methods,
which we developed for semiquantitative analyses of copy
number and gene expression [8-11].

2  Materials  and  methods 

2.1  Genomic  DNA  and  cDNA 

The following genomic DNA samples were used for the SM
PCR experiments of the genes on the 18q21-qter region: a
normal tissue and a primary breast tumor from an individual
with invasive ductal carcinoma, together with its lymph node
metastasis; a normal and a primary tumor tissue of prostate
from an individual with prostate adenocarcinoma; primary
cultures of normal breast and prostate epithelial cells; and
six mammary (MCF7, MDA-MB-468, MDA-MB-231, BT-20,
T-47D, and Hs-578T) and four prostate (PC3, DU145, LNCaP,
and MDA PCa2b) carcinoma cell lines. The primary cultures
(HMEC and PrEC) were purchased from Cambrex, and the
cancer cell lines were originally obtained from ATCC. The
quality of the DNA preparations was confirmed by gel
electrophoresis. Genomic DNA from the MCF7, MDA-MB-468,
and BT-20 breast cancer cell lines was also used in the
arrayCGH experiments.

For the expression analysis by SM RT-PCR, we used the
following RNA samples: a normal and a primary tumor tissue
of breast from an individual with invasive ductal carcinoma;
a normal and a primary carcinoma tissue of prostate;
another normal prostate tissue and a hyperplastic prostate
tissue; five mammary (MCF7, MDA-MB-468, MDA-MB-231,
BT-20, and T-47D) and three prostate (PC3, DU145, and
LNCaP) cancer cell lines; and primary cultures of normal
mammary and prostate epithelial cells. The near absence of
degradation in the RNA specimens was confirmed by gel
electrophoresis. Total RNA was used to prepare cDNA by RT using
oligo dT as a primer and the Advantage RT-for-PCR Kit
(BD Biosciences-Clontech). These RNA and cDNA samples were
also used in the DNA microarray hybridization and real-time
qRT-PCR experiments, respectively. Additionally, cDNA
samples prepared from 12 matched pairs of normal and
tumor breast tissues were used in the real-time qRT-PCR
experiments.

2.2  SM  PCR  and  SM  RT-PCR  experiments  to  measure 
and  determine  copy  number  and  expression  of 
the  genes  on  the  18q21-qter  region 

The detailed experimental procedures to establish the SM
(RT-)PCR system have been previously described [8-11].
Briefly, the genes were categorized into groups of
approximately ten genes, and the concentrations of PCR primers in
multiplex reactions were optimized to amplify different sizes
of DNA fragments in single exons at similar band intensities
using genomic DNA from normal human tissues as a
control. Genomic DNA and cDNA from the human cells and
tissues were used to examine the copy number and expression,
respectively, of the genes on the chromosomal region of
18q21-qter. After SM (RT-)PCR, small aliquots of reaction
products were analyzed by 8% polyacrylamide gel
electrophoresis, followed by staining with ethidium bromide.
The gel pictures were taken and saved in TIFF format, and the
band intensity was measured using the ImageQuant
software (Amersham Biosciences) and normalized by adjusting
the average band intensities of individual gels.
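
The gel-to-gel normalization is described only briefly here, so the following Python sketch shows one plausible reading of it: each gel's band intensities are rescaled so that every gel has the same average intensity. The function name, the data layout, and the choice of the grand mean as the common target are assumptions made for illustration, not the documented ImageQuant procedure.

```python
def normalize_gels(gels, target_mean=None):
    """Rescale band intensities so every gel has the same mean.

    gels: dict mapping a gel label to a list of band intensities.
    target_mean: common mean to scale to; defaults to the mean of
    the per-gel means (an assumption, not stated in the text).
    """
    means = {}
    for label, bands in gels.items():
        if not bands:
            raise ValueError(f"gel {label!r} has no bands")
        mean = sum(bands) / len(bands)
        if mean <= 0:
            raise ValueError(f"gel {label!r} has non-positive mean intensity")
        means[label] = mean
    if target_mean is None:
        target_mean = sum(means.values()) / len(means)
    # Multiply each band by the ratio of the target mean to its gel's mean.
    return {
        label: [band * target_mean / means[label] for band in bands]
        for label, bands in gels.items()
    }


# Hypothetical intensities: gel_2 was imaged at half the brightness of gel_1.
gels = {
    "gel_1": [100.0, 200.0, 300.0],
    "gel_2": [50.0, 100.0, 150.0],
}
normalized = normalize_gels(gels)
print(normalized)
```

After rescaling, the two hypothetical gels carry identical band values, so differences that remain between lanes reflect the samples rather than exposure or staining.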

2.3  DNA  microarray  hybridization  experiments  to 
measure  and  determine  copy  number  and 
expression  of  the  genes  on  the  18q21-qter  region 

The copy number was also analyzed by DNA microarray
hybridization using the arrayCGH method [12-14]. Changes
in copy number in three breast cancer cell lines (MCF7,
MDA-MB-468, and BT-20) were examined at NimbleGen
(Madison, WI, USA). Genomic DNA from normal
females was used as a reference for all three hybridization
experiments. Relative fluorescence intensity, which is
indicative of relative copy number, was determined over the
entire human genome with 385 000 isothermal long
oligonucleotide probes tiled through genic and intergenic regions
at a median probe spacing of 6000 base pairs.

The expression analysis was also performed by DNA
microarray hybridization. Total RNA was used from a normal
breast tissue, a normal prostate tissue, primary cultures
of normal mammary and prostate epithelial cells, and five
mammary (MCF7, MDA-MB-468, MDA-MB-231, BT-20, and
T-47D) and three prostate (PC3, DU145, and LNCaP) cancer
cell lines. The same preparations of RNA that were used in
SM RT-PCR were used in the microarray hybridization.
Illumina’s Sentrix Human-6 Expression BeadChips, which
carry probes for the entire 23 000 RefSeq collection
and an additional 23 000 other expressed sequences, were
hybridized with the biotinylated cRNA that was prepared
following the manufacturer’s protocol. After hybridization
and washing, the BeadChips were treated with
streptavidin-Cy3, washed, dried, and scanned for fluorescence
intensity using a BeadStation 500 at the DNA
Microarray Facility at Burnham Institute for Medical
Research. Raw data were generated and then normalized using
the Beadscan 3.0 software. The 30X average redundancy
of the BeadChips allows absolute signal detection
with a single fluorescent label. Other microarrays require
CGH-type hybridization using two different fluorescent
labels, followed by determination of relative signal intensity,
due to a wide variation in the amounts of probes printed on
different slides.

Data for the genes and sequences on the chromosomal
region of 18q21-qter were extracted from the results of the
DNA microarray hybridization experiments and used for
comparison with the results from the SM PCR and SM
RT-PCR experiments.

2.4  Real-time  qRT-PCR  experiments  to  measure  and 
determine  expression  of  the  selected  genes  on 
the  18q21-qter  region 

Real-time qRT-PCR was performed for several genes
(CCBE1, CCDC11, CD226, NP_115536.1, NP_689683.2,
RNF152, SERPINB8, TCF4, and DYM). The ubiquitously
expressed DYM gene was used as a control. A subset of the
cDNA samples that were used in the SM RT-PCR
experiments and additional cDNA samples prepared from the 12
matched pairs of normal and tumor breast tissues were
analyzed. The same primer pairs that were used in the SM
RT-PCR experiments were also used in the real-time qRT-PCR
experiments. The reactions were run with the Power SYBR
Green PCR Master Mix (Applied Biosystems) on the
Mx3000p system (Stratagene) under the default conditions,
except that the annealing temperature was raised from
55°C to 60°C. Data were analyzed using
the MxPro software installed with the equipment, and the Ct
values were obtained for the individual reactions.
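
The text states only that Ct values were obtained and that DYM served as the control; it does not say how relative expression was derived from them. As an illustration, the sketch below applies the widely used comparative Ct (2^-ΔΔCt) calculation, which assumes near-100% amplification efficiency for both target and control. The function name and all Ct values are hypothetical.

```python
def relative_expression(ct_target_tumor, ct_control_tumor,
                        ct_target_normal, ct_control_normal):
    """Fold change of a target gene in tumor versus normal tissue.

    Uses the comparative Ct (2^-ddCt) calculation, with a control gene
    (DYM in this study) correcting for differences in cDNA input.
    """
    # Normalize the target to the control gene within each sample.
    delta_tumor = ct_target_tumor - ct_control_tumor
    delta_normal = ct_target_normal - ct_control_normal
    # Compare the tumor sample with its matched normal sample.
    delta_delta = delta_tumor - delta_normal
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values for one matched normal/tumor pair.
fold = relative_expression(
    ct_target_tumor=28.0, ct_control_tumor=22.0,
    ct_target_normal=25.0, ct_control_normal=22.0,
)
print(f"fold change (tumor / normal): {fold}")
```

In this made-up pair the target crosses threshold three cycles later in the tumor after correction for DYM, which corresponds to an eightfold decrease in expression.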


3  Results 

3.1  SM  PCR  and  SM  RT-PCR  analyses  of  the  genes  on 
the  chromosomal  region  of  18q21-qter 

Using control genomic DNA template, the optimal primer
concentrations were determined to equalize band intensity as
previously described [8-11]. Out of 140 genes on the
chromosomal region of 18q21-qter, 13 genes failed to amplify
specifically and were excluded from the system. Together
with 4 genes in the 18q12.3 region, which neighbors the
18q21.1 region, and 2 genes in 18q12.2, the SM RT-PCR
system consisted of 133 genes in 12 sets (Sets A-L). The list of
the genes is shown in Table 1. The nucleotide sequences and
concentrations of the primers used in this study are also
shown.

We examined the copy number of those genes in breast
and prostate cells and tissues by the SM PCR method.
Results are shown in the left column of Fig. 1. We found
dozens of genes with decreases in band intensity in cancer
cell lines, most evidently in MCF7 cells. Complete
disappearance of the band was observed with the SMAD4 gene (Set
B) in the MDA-MB-468 cells, and a drastic decrease in band
intensity was observed with ELAC1 (Set C) and PLEKHE1
(Set H) in MDA-MB-468 and RAX (Set F) in BT-20. We also
observed an increase, which is suggestive of gene
amplification, in the SLC14A1 gene (Set E) in MCF7, among others.


Band intensity was densitometrically measured for the
SM PCR bands and the values were entered into a table of
genes ordered by chromosomal location. Partial results
for a normal breast, a primary culture of normal mammary
epithelial cells, and three mammary carcinoma cell lines
are shown in gray scale (black and white for the
lowest and highest band intensity, respectively) in the left
column of Fig. 2.

We next examined the expression of those 133 genes in
breast and prostate cells and tissues. Results are shown in
the right column of Fig. 1. The PCR conditions were
designed so that small amounts of genomic DNA would produce
bands. The absence of at least one band therefore indicated
that the cDNA specimens contained minimal contaminating
genomic DNA. We found that approximately 40% of the genes
were ubiquitously expressed at high levels. Similar band
intensities of the ubiquitously expressed genes across the
specimens suggested comparable quality and quantity of the
cDNA preparations. We also observed that 17 genes were not
transcribed in either normal or cancerous breast/prostate
cells/tissues. The other genes were expressed in some, but
not all, of the cDNA samples examined. Among them, several
genes showed interesting expression profiles. For instance, the
expression of CCDC11 (Set C) was completely repressed in
three out of five breast and three out of three prostate cancer
cell lines that were examined, whereas it was expressed in
normal cells and tissues. On the other hand, the expression
of the RNF152 gene (Set A) was repressed in prostate, but not
in breast, cancer cell lines. Additional genes that exhibited
down-regulation of gene expression include TCF4 (Set E),
NP_689683.2 (Set F), SERPINB8 (Set G), CCBE1 (Set I),
CD226 (Set K), and NP_115536.1 (Set K) in a majority
of the cancer cell lines and PSTPIP2 (Set A) and KIAA0427
(Set H) in a minority of the cancer cell lines that were
examined.

For  quantification,  band  intensity  was  measured  for  the 
SM  RT-PCR  bands.  Together  with  genomic  DNA  as  a  control, 
the  results  of  a  normal  breast,  a  normal  prostate,  primary 
culture  of  normal  mammary  and  prostate  epithelial  cells, 
and  five  mammary  and  three  prostate  cancer  cell  lines  are 
schematically  shown  in  a  gray  scale  from  black  (weakest)  to 
white  (strongest)  in  the  middle  column  of  Fig.  2. 

3.2  Comparison  of  the  SM  RT-PCR  and  SM  PCR 
results  with  the  results  from  DNA  microarray 
hybridization 

We  compared  the  results  from  the  SM  RT-PCR  experiments 
with  the  results  from  the  DNA  microarray  hybridization 
experiments.  We  normalized  data  from  the  DNA  microarray 
hybridization  experiments  using  the  Beadscan  3.0  software. 
The average fluorescence signal intensities of beads for
individual genes on the 18q21-qter region were extracted and
gray-scaled, and are shown, side-by-side with the data from
the SM RT-PCR experiments, in the right column of Fig. 2.

Table  1.  Primers  used  in  the  study


Gene name           Fragment size (bp)

Set A
ZNF532              175
CDH7                160
TNFRSF11A           146
MBD1                133
MBD2                121
DCC                 110
RNF152              100
CXXC1               91
ONECUT2             83
MRO                 76
PSTPIP2             70

Set B
CAD20_HUMAN         175
CDH19               160
SMAD7               146
RKHD2               133
BCL2                121
PMAIP1              110
SMAD4               100
MAPK4               91
SMAD2               83
RAB27B              76
PIK3C3              70

Set C
ME2                 133
ELAC1               121
NETO1               110
CCDC11              100
CBLN2               83
C18orf24            76

Set D
SERPINB2            151
ZNF407              138
GALR1               126
WDR7                115
GRP                 105
ALPK2               96
LIPG                88
SLC39A6             75
GALNT1              66

Set E
XP_372695.2         192
POLI_HUMAN          176
Q96N33_HUMAN        161
Q8TCD1_HUMAN        147
Q7Z5E4_HUMAN        134
SLC14A1             122
TCF4                111
PIGN                101
FVT1                92
NP_001008240.1      77

Primer 1 sequence   Primer 2 sequence   Primer conc. (nM)

TGAAGGGCCTCCAAACTTGGGTAT  AGGACTGGCCACTTTCTTGGTTTC  41 

CTGAGAAACCTCAACGTCATCCGA  CACCAG  G  AT  CAACATCG  G  CTT  CTT  205 

ATG  CC  AG  G  ATGCTCT  CATT  G  GTC  A  TGTGGATTTGCTTCCAGGCTCAGT  41 

TCCAACGAAGCAGGAAGCAGGT  C  AACAGG  G  CTT  CTGTG  G  AAG  CTG  102 

CCAGGTAGCAATGATGAGACCCTTT  TGTT  AAG  CCAAACAG  CAG  G  GTTCT  68 

ACT  ACCC  AACAACCACCT  ATG  CTG  AGTGGGTGAGTTGGTCGAACACAAG  102 

TGTCATCGCCATTCCACACACTTC  ACGCTCCTTGGAGATGGGCA  136 

TGTTTGAGCAGGAGCGCAATGT  GGATCGTGCTGGATCGTCTGGT  170 

GAACAAACGCCCGTCAAAGGAGAT  TGAAGAAGTTGCTGACGGTTGTGAG  102 

CTGGTGTATGGACTGTATGACCCTGTGA  CCAGAACGACGGTCAGAGTCTTCA  102 

TG  AGGCTCAAGAAT  GTG  AACG  AAT  AAACT  TGACAGCTGATTCACATGTAACCACAATGC  102 

TGAACAGCACTGTCCACAGCTA  AAGTCGAAGCTCTGTTCCGAGTCC  273 

AGGAGCCTATACAGG  CAGT  CTTT  G  ATCCAG  CT  AAT  G  ACCCTGTTCCCT  68 

CATCTTCATCAAGTCCGCCACACT  GCTG  CAT  AAACTCGT  G  GT  CATT  G  G  85 

CAAGACGAAAGCACGACTGTGTGA  ACTGGACATGATGGCGTTCTCT  55 

AGCATGCGGCCTCTGTTTGATTTC  AG  G  CAT  GTT  G  ACTTCACTT  GTG  GC  68 

G  AAT  CT  GAT  AT  CCAAACT  CTTCT  G  CTCAG  G  TCAAATTG  ATG  AAACGTGCACCTCCT  102 

GCTGCTGCTGGAATTGGTGTTGAT  TGATGCTCTGTCTTGGGTAATCCG  102 

GACCACGACAACATCGTCAAAGTG  TGTACGCCACGCTGAACTTGAACA  102 

AGCTTCACCAATCAAGTCCCATGA  AACAGT  CCATAG  G  G  ACCACACACA  55 

GCGAATGGAACAGTGTGTGGAGAAGA  CCCATCCAAGTTTCCAGAATTTCCACC  41 

CAGAGTCTGATTGATGAGAGTGTCCATGC  GGGCAAACTTGTGAATCTGTTCCACC  41 

GGAGAGAGAATTCTGGGTCTTGGAGA  TT  CCCACATC  AAT  ACACACT  G  G  CAG  G  100 

AT  G  G  ACAAAGC  AAAG  G  AGCAT  G  G  C  AG  G  CAACT  G  GTTTGTACCTCT  G  AC  100 

AG  G  AG  CT  ACAG  CT  G  ACTTT  G  CAG  ATG  ACAGCT  GTG  ATCCACAGTG  ATGGT  100 

AGGAGCGGAAAGCACAGATTGCAT  GCTAATCGGTCTTCCTCCCAGAGTTT  125 

GCTCATGGAAAGGGAAGACAAAGTGC  CCGAGAATGTGGAGTATTTCCAGCC  125 

AGAAGCCTCCCAAAGAGCAAAGA  GAAGGAACACCATTGAACTCATCACAAGT  250 

TT  CAACAAG  G  G  ACG  G  G  CC  AATTT  C  GTCCAGTTCTCCCTGTCATAACACCT  83 

CTGTACTCCCACACCGTGCT  GGAGCCCTCCTGGGTGTAGATGA  125 

ACAGACAGTTCTGGTGGTGGTTGT  GCGGTGATTCTGAAGAGGAAGGAA  67 

CACCTCAGCTGCGCTGCATT  AAA  ACGTTGCGGTTGGAAGTCCAGAT  166 

GAAGCTGCAAGGAATTTGCTGGGT  TGAATCCCACGAAGGCTGCTGATT  83 

AGTTT  AAAG  CACT  ACACCAGT  GT  AACAAGT  CAATGCTCGGCTGCTTCTGTTTCT  166 

ACTT  G  G  G  AG  ACCT  CTTG  AAG  ATCCAG  AGGTAGCTGCGAAACTCCTTCCA  83 

GTTCTACTAAAGGCTGGCATGACCGT  AAGATACGCCAGCATGGCTGACAA  166 

TCAGTGCCTGGATAAAGCCACAGAAG  CTTCCATTGCAGTCTCTAATGCTGGG  125 

GCATACCCAAGGACAAGGCCATTA  TG  CG  G  GCT  ACACAAG  ATT  GATT  CC  58 

TCTCCTTGTGAACCGGGAACATCA  AAAGCAGACACAGCAGGGTTTGAA  115 

TG  G  CCATG  ACCCACAT  G  AG  G  ATTT  TG  AAACAGAAGATG  AACTTG  AT  GACCAGGG  58 

GGTGCAGACATGAATGGATTACCAACA  AGGTCAGAGACAATTACAAGGAAGATGC  87 

ATGCTCTGGCGTCTACTGCATTTC  GACACCCAATTCCTTCACTTGCCA  87 

CTTCTGTTTGGCCACGCTATTGTTCC  TCTTGGCTTGCAGGTAGAAGATGC  115 

ATGAGGACCTGACACCAGAGCAGAA  GCTCTTTGAAAGCCTCGTTGATGTC  144 

TTT  G  GTGTT  CCTCAATGG  CCT  G  G  TG  GTG  CTT  CAG  CAACCTCACAT  115 

GGTGGGCAGGATCGTGTTTGTGT  CCCTT  AT  G  GC  AAACTT  G  G  ATG  CAG  115 

GTTAGCACCAGGCAAACACAGTCA  TCAACAACTTCTGCTGGTGAAGCC  115 
Table 1. Continued

Gene name | Fragment size (bp) | Primer 1 sequence | Primer 2 sequence | Primer conc. (nM)

Set E (continued)
SERPINB7 | 71 | AGATTCTTGAGCTCAGATACAATGGTGGC | TTCAGAGAGGTCATTCTCAGGCAGC | 87
KATNAL2 | 66 | ACTGCCGACTTTCTGGATGTGCTA | GTATCTCTGAGCCAGATTCTTTGCGG | 87
SEC11L3 | 62 | GAGGCTTGTACAAAGAAGGCCAGA | TCTTGCTCTTCCCACCACGTCCTT | 115

Set F
XP_113971.3 | 193 | AAATCCACATGCGGAAGCACACAG | CAGCTCTGGCGCTTGATGTGGC | 188
ENSG00000188451 | 177 | CACACCCTGACACAGCTTATTTCTGC | ATCTCCAGTAACTTTGCCACCCTTC | 63
TCEB3C | 162 | TCTGGCCACTAAGACGGAGCCGAAA | GCCGCTAAGTCTCTGGCAAAGT | 38
NP_005594.1 | 135 | CAGAAGCATCGCAAGCGGTTGAA | GGAGGAGATGAGGTCCGCGTAG | 125
VPS4B | 123 | TAGGGACCACTCAGAACAGTCTCA | TCCTAACAGGCTGCATAAGGGCAT | 63
STARD6 | 112 | AAACCCAGCATATTCCAAACTAGTGATGT | GAGGATGAAGTTTACTAAGTTGGAAGGCA | 94
KIAA1468 | 93 | TTGCTGCAAGCTTAGTGAGTGAAGA | ATACATTGGCCAACATGGCACCTG | 125
NP_998767.1 | 85 | TCCCTCAGCCACCAACATCCATTT | GAGAACCAGGACTGGCTGTGCC | 94
TXNL1 | 78 | GTTCAGTCGAATCAAGGTGAAGAGGA | TGTTGCCTGGACTGGAGTACCAAT | 125
RAX | 72 | GACAAGTTCCCGCTGGACGAGG | CTCCTTGGCTTTCAGACGCAGC | 188
SERPINB10 | 67 | ATCCAGAATCTCCTGCCTGATGAC | ATAGGGCGTTCACCAGAATCATCC | 94
NP_689683.2 | 63 | AGTGGCTCGCCATGAGCAAGAAAT | CTCCCAGTTGTGTCTCAATGTCCACT | 94

Set G
ZCCHC2 | 194 | GAACACGAACGCTAATGGGACAGT | TGCCATTGCAAATGGATGGCAGAG | 87
XP_371118.1 | 178 | GATCAACGAGGAAAGCGACTACCA | CAAGGCTTCATTCTCTCGCTGGAA | 87
C18orf12 | 163 | TTCCTCCAACTGCATCGCTCAATC | CTCCCACTTTCAGCATTCTGGCTT | 87
LOXHD1 | 149 | TCTTTAACTGTGACTGCCTCATCCC | CTCATAGCCTGTTGTCACGATGACTT | 87
NP_066015.1 | 136 | GTGGAGGAAGAGGCAAAGCTGTTT | GGTTCTGCCTGCTCTGAACCAAGA | 87
CPLX4 | 124 | TGGCTGGAGATGATGTGGATTTACC | TCCAAGTCCATGTTCTGGAGATTCTG | 87
HDHD2 | 113 | GCACCTCTGATAGCAATCCACAAAGC | GTGGCTTTGGTATCTGTGGCATACTC | 46
C18orf54 | 103 | GCCAAGAGAAATCTAGAGCAGTGTACTGAA | CCCATGATCTGTCTGCTTCAAGTTTATCT | 87
SIA8C_HUMAN | 94 | ATTTACCACCAAGTGGCAGGAGTC | CAGAGTCAGCTTGGTGAGCCCTT | 173
ACAA2 | 86 | CCCATGGCAATGACTGCAGAGAATC | TGCTGTGACTGCAGGGCATATT | 58
SERPINB12 | 73 | GGCACAGATCCTGGAAATGAGGTA | TTTAGAGTGAGATGGCAGCAGCAC | 58
SERPINB8 | 68 | CAGAAGTTCTATCAGGCAGAGCTGGA | TGCTTCCTGCACTCTTCAGTGTCT | 115
NEDD4L | 64 | TCGCCTTGACTTACCTCCATATGAAACC | CACGGCCATGAGAAGTTTCTCTCGTA | 87

Set H
PLEKHE1 | 195 | AAGGAGAAGGAGAAACAGCAGCACCT | CACCTCTATCACATTGTGGCTCCT | 115
SIA8E_HUMAN | 179 | TTCCACAAGCTGGAGAAGTGGC | TACTGCGGATGGAAGTAGTAGACAGC | 144
MALT1 | 164 | GAGGACAAGCAGGAAGTGAATGTTGG | TGCAATGAGTGATAATGCCCTGCTCC | 46
C18orf26 | 150 | TGGCTTGTCTCTTAGCCTGTGTGA | AGGGTGTTCCAGGTTTGACAGT | 46
KIAA0427 | 137 | ACAGCTGCCTGAGATGATGACAGA | GCGTCAGAGGGTTCCAGCTGTTAG | 58
MC4R | 125 | GACTCTGGGTGTCATCAGCTTGTT | TCACCAGCATATCAGCCACAGCCAA | 46
MYO5B | 114 | TTCACCAGAGTGGAGCAGTTCAGA | AGGTACACAGGGAGCAGATAGCCT | 87
ATP5A1 | 104 | AAGTGGCTGTTATCTATGCGGGTG | CTGGCTGACGACATGAGACAAGAA | 58
LMAN1 | 95 | AACCGTCAGACTGGTCAGTGGAAT | GCAGGTGCTCTTTGATGTCAATGAAGTG | 87
SERPINB13 | 87 | GATGGCTCTATTAGTAGCTCTACCAAGCTG | TTCTTTCTTAAACTCCCTGTCCCATTGCC | 115
FECH | 80 | ATCCAGTCAAACGAGCTGTGTTCC | TTAGTCTCCCTGCAGACAGGATTGAC | 115
IER3IP1 | 69 | GTGGATTTGGAGAAGAGCCGGGAAT | TCACGGTTCTTACAGATCGAATAAGGTTCA | 144
C18orf20 | 65 | AGGTTACAACCAGAGACCTGAAGGA | TCCTCTTCTCTTTGACAACCATGTGGC | 173

Set I
ENSG00000182288 | 232 | TGCTGCTCAAGCTCCATTCACGA | GCCAGAGACAGGCATGAGAGATCAAA | 50
Q8N7F0_HUMAN | 214 | TTTGGAAAGGAACATAAATGACCTGACAGA | TTTAGGTCTTTGAGAAATTGCCACAGTGTT | 94
Q9H380_HUMAN | 197 | CCTGCATGGATTTGCATGTTTCCC | AGGTTAGAAGAATAAGAAAGGGAGTCTGGA | 125
CCBE1 | 181 | CCTGGTTCTTTCGACTTCCTGCTA | TTCTTGGATGGTCATCTCCAGAGCC | 63
NP_079490.1 | 152 | GGAAATGAACTGGCTGGATGAAGATCTGA | TCATCTTCTTGCAAAGCCATTGGTATGT | 188
NARS | 139 | AAGGTGGTGCCACACTCTTCAA | TGCTCTGCCCGGTATGACTGAGCAAT | 94
PIAS2 | 127 | ACCCTTAACAGCAAGCAGTACGTC | GTCAGGAATGTTACTTCCACTGCTGG | 94
Table 1. Continued

Gene name | Fragment size (bp) | Primer 1 sequence | Primer 2 sequence | Primer conc. (nM)

Set I (continued)
SERPINB4 | 116 | TGGAAGAGAGCTATGACCTCAAGGA | TTTAGATACTGAGAGACCGTGGCTCC | 94
DYM | 106 | CAAATATGTGGAAGAGGAGCAGCCC | CTGGATGTCCTGTGGATTCCAGT | 94
CCDC5 | 97 | TCAAGACCTTCTCATGGAGAGTGTGAA | GGCCACCGCACTGTCAACCAAA | 125
ENSG00000141690 | 89 | TTGGACTTCCATCATCCTCATCAACTACT | AACTGTTCCAGAGATTCAGGGTGG | 94
C18orf24 | 82 | ACCCGTAAAGAAGCCTCCCAAAGA | AGGAACACCATTGAACTCATCACAAGT | 94

Set J
NP_055728.1 | 235 | ACTGGTGCCTGTGTATGTGAAGGT | AGCCAGGGTCATATTTCCCGTGTA | 46
SALL3 | 200 | ACAACGAGATCTCCGTCATCCAGA | ATCCTCGATAAACCGCGTGAATGG | 87
ENSG00000196512 | 169 | AACCCAGAATGCCTCTTCTCCTCT | AGCTTCCTTGCACTGGGCTATAAG | 58
SOCS6 | 155 | TGGATCAGTCCGTGAATGGCTTGT | TGTGCCAGTGAGTCCACTGAAGTT | 46
PQLC1 | 142 | CAGTGGAGCAGCTTCTCGGACTACG | CCAGCATGGCTTCGGTCAGCACAG | 115
CNDP2 | 130 | GTTGAGCCAGACTTGACCAGGGAA | TTTCATTCTGGGAGTGGGCTCCGT | 87
Q8N8S9_HUMAN | 119 | CTCTTGTTCCCAGGCCCATCCAGC | AATCCGAAGGAGGTTCAGGGACTG | 173
NP_997344.1 | 100 | TAATCATTGGCTGCCTCCACTCCA | CTTGACGGCTGTCATCAAACAGGT | 87
NP_872376.1 | 92 | AAAGAGGGAGAGAAGGAACCAGGCT | AGGAACCTGGCCCTTCGGAAGTCT | 87
NP_079057.1 | 85 | CTGAATCAGATCCGTAAGCTCCAGAGG | GCAAGTGCCTGAGTTCAGTCTCTAAGT | 87
ENSG00000182671 | 79 | ATTGCCAGAAAGAACCTGGCTTGC | AGCAAGCCTAATGAAGAGGCTCCA | 58
ENSG00000176594 | 74 | AATACTTCCTTGGTCTGTTGGGCCAT | TCAGAGCCAGCTGCTTAAGGAATGTG | 58
CYB5 | 70 | AAGCTGGAGGTGACGCTACTGAGAAC | TTGGACATTTCCCTGGCATCTGTAG | 87

Set K
NP_115536.1 | 185 | GCCTTTCTTGGAATTCCTTTGTCTCCTGC | CGATCCATCAGAGTCCAGCAGATGTT | 63
ZADH2 | 170 | TTAAGCAGGAGTACCCTGAAGGTGTC | CAATGTTCCTGCTTTCACAGGCGA | 94
ZNF236 | 156 | CGGCCGTTCCATTGCACGCTTTGT | CCGCTTCATGTGCAGCTTCATGTT | 50
CD226 | 143 | GCCACATTGTTTCGGAACCTGGAA | TCTGCCATGGACCAAGTTGCAGTA | 63
TXNL4A | 131 | ACAAGATTAACTGGGCCATGGAGG | TCAGTAGCGGTACTTGGTGGAGTA | 94
ATP9B | 120 | ATCTCCTTCACCGCACTGATCCTGA | GAGTGAGGACACGTAGCAGCCTAA | 125
PARD6G | 101 | AATGACGAGGTCCTGGAGGTGAAC | GTGACGATGAGGTTGTGGCTGTTG | 312
GTSCR1 | 93 | ACTCATCTACTGCAAGCTTGGCCC | ACTGACCATAGAGATGGTAGTGATGTCT | 63
CTDP1 | 86 | CAGATGTTTGGTGAAGAGCTGCCT | TCAGACATACTGGGCTGTCGCTTT | 94
Q9NY04_HUMAN | 80 | CCAAACTGCCATTCCAGTCACTCA | CCTAGTAGAACAAAGAAAGCCCTGGAA | 125
CNDP1 | 75 | ATGATCCGGGATGGATCCACCATT | AATTAGCACCACGCTCTTGTGGAC | 94
FBXO15 | 71 | ACCCTCTGACAGCTCTAGCTTCTTG | CACTCTTCCTTCCGCATCAACGTA | 94

Set L
SDCCAG33 | 219 | GCAACGATTGTGCCTCTCAGTTCA | TCCGGTTGCAGAGCTTACATTGGA | 83
TXNDC10 | 171 | GTGCTATGGAATCTACACAGCCGA | AATACATCCTTGGGCTCCTGCACT | 67
NP_079081.1 | 132 | CAGCTCCCTCAAGAGTTACCTGTCA | TCTGTTCTGCCACCTCCTCTCTCT | 125
Q96MY0_HUMAN | 121 | AAAGGTGCCATGCCAGAGAGATGA | AGGACAGAAGCAGTTTGCTGATGC | 83
NP_997343.1 | 111 | CAATCCTGGCGGTTACCTCAGCGG | CAGCGCGTCTGGAGTAGTTTCTTT | 125
NP_054896.1 | 102 | CGTACAGTATACGGAGAAGCTGCACA | CTCAAACTGGGCTCAGTCTTCAAGCA | 167
XP_058931.6 | 87 | TGGTGGCTATTGATGTGGACATGG | TGCAGTCTTCATCTCCTGTGCAGT | 83
RTTN | 81 | CCCAAACTCAGAAGCAAACCCTCT | CAGGAAGAATTAAGGAGCTGCACGAG | 167
MBP | 72 | AAGGCCAGAGACCAGGATTTGGCTA | CCTTGAATCCCTTGTGAGCCGATT | 208


We also compared the results from the SM PCR experiments with the results from DNA microarray hybridization experiments. Because we had not established an arrayCGH system in our laboratory, we outsourced the arrayCGH hybridization experiments to NimbleGen, one of the pioneering commercial providers of the technique. We submitted the same genomic DNA from the MCF7, MDA-MB-468, and BT-20 breast cancer cell lines that was used in our SM PCR experiments. Data for the probes corresponding to sequences on chromosome 18 were extracted, and the log2 values of fluorescence intensity relative to normal female genomic DNA were plotted on the Y-axis against chromosomal location on the X-axis (pter to qter from left to right). Results are shown in the upper panel (a) of Fig. 3. Similarly, the log2 values of the band intensity of those cell lines relative to normal female breast tissue were calculated from the SM PCR data in Fig. 2 and plotted. Results are shown in the lower panel (b) of Fig. 3. Except for the number of data points, the graphs were similar between the SM (RT-)PCR and arrayCGH results. Both showed that the copy number was constant over the 18q21-qter chromosomal region in MDA-MB-468 and BT-20 cells, whereas there were at least two changes in copy number in the region in MCF7 cells: one around 46 Mb and the other around 60 Mb from pter. Additionally, homozygous deletion of the SMAD4 gene was detected by both methods.

Figure 1. SM PCR and SM RT-PCR results of breast and prostate cells and tissues. The left and right panels show the results of the SM PCR and SM RT-PCR experiments, respectively. There are a total of 12 sets (A-L). SM PCR and SM RT-PCR were performed to examine copy number and expression changes in breast and prostate cancer cells and tissues. The sources of genomic DNA and cDNA are abbreviated: a normal sample (NB), primary tumor (TB), and metastasized tumor (MB) of breast tissue from an individual; a normal sample (NP) and primary tumor tissues (TP) of prostate from an individual; a normal prostate tissue (NP) from a third individual; a hyperplastic prostate tissue (HyP) from a fourth individual; primary cultures of normal mammary (MP) and prostate (PP) epithelial cells; and MCF7 (MCF), MDA-MB-468 (468), MDA-MB-231 (231), BT-20 (BT), T-47D (T47), Hs-578T (578), PC3 (PC), DU145 (DU), LNCaP (LN), and MDA PCa2b (PCa) cancer cell lines. The locations of the DNA fragments amplified from the individual genes are also shown at the left side of the gel pictures. The symbol M denotes DNA fragment size markers, and the symbol G shows the results of the genomic DNA control in the SM RT-PCR experiments.

Figure 2. Intensities of the bands amplified by SM PCR and SM RT-PCR and the intensities of fluorescence detected after DNA microarray hybridization of the genes in the chromosomal region 18q21-qter. Data obtained in the SM RT-PCR and SM PCR experiments shown in Fig. 1 were used to prepare this table by densitometric measurement of band intensity. The average band intensities of individual gels were adjusted to normalize the values. The partial results of the SM PCR and SM RT-PCR experiments are shown in the left and center columns, respectively. The values of the band intensities of the PCR-amplified fragments were aligned by their chromosomal locations, and are shown in gray scale, with white as the strongest and black as the weakest. Data on fluorescence signal intensity were extracted for the genes in 18q21-qter from the DNA microarray hybridization results, normalized, and aligned (shown in the right column). The negative values obtained by microarray hybridization were recorded as zero in the table. The gene names, cytobands, the start and end of the gene locations, and the primer sets are also shown.

Figure 3. Comparison of the SM PCR results and the arrayCGH results. Data for the genes on chromosome 18 were extracted and normalized from the arrayCGH experiments of three breast cancer cell lines, MCF7, MDA-MB-468, and BT-20. The log2 values were calculated and plotted on the Y-axis with the chromosomal location on the X-axis in (a). The rightmost end of the X-axis corresponds to 80 000 000 base pairs from the pter. Since there are only 76 117 153 base pairs in the chromosome, there is a gap between the qter of the chromosome and the ends of the graphs. The relative band intensity was calculated by dividing the band intensity values shown in Fig. 2 of the genes for normal mammary primary cells and the MCF7, MDA-MB-468, and BT-20 breast cancer cell lines by the corresponding values for a normal breast tissue. The log2 values were then plotted on the graph in (b).
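
The log2 relative intensities used for the SM PCR panel of Fig. 3 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the gene labels, and all intensity values are hypothetical, and the handling of a missing band as negative infinity is a choice made here, not something specified in the text.

```python
import math


def log2_relative_intensity(sample, reference):
    """Return log2(sample / reference) for every gene present in both inputs.

    A zero or negative reference cannot be used as a denominator, so such
    genes are skipped. A missing band in the sample (intensity 0) is reported
    as -inf, the pattern expected for a homozygous deletion.
    """
    ratios = {}
    for gene, ref_value in reference.items():
        if gene not in sample or ref_value <= 0:
            continue
        value = sample[gene]
        ratios[gene] = math.log2(value / ref_value) if value > 0 else float("-inf")
    return ratios


# Hypothetical normalized band intensities (arbitrary units).
normal_breast = {"GENE_A": 100.0, "GENE_B": 80.0, "GENE_C": 120.0}
cell_line = {"GENE_A": 50.0, "GENE_B": 160.0, "GENE_C": 0.0}
print(log2_relative_intensity(cell_line, normal_breast))
```

On this scale a value of -1 corresponds to a halving of the signal and +1 to a doubling, so a step between neighboring genes along the chromosome suggests a copy number change.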

3.3  Comparison  of  the  real-time  qRT-PCR  results  with 
the  results  from  the  SM  RT-PCR  and  DNA 
microarray  hybridization 

To confirm our findings from the moderately high-throughput SM RT-PCR and high-throughput DNA microarray hybridization expression analyses, we next performed real-time qRT-PCR for the promising cancer gene candidates: CCBE1, CCDC11, CD226, NP_115536.1, NP_689683.2, ONECUT2, RNF152, SERPINB8, and TCF4. As a control, we examined the expression of the DYM gene. This gene encodes Dymeclin (Dyggve-Melchior-Clausen syndrome protein) [15], and both the SM RT-PCR and the DNA microarray hybridization experiments showed ubiquitous expression in large quantity in all of the cells and tissues we examined. The log2 values of band/fluorescence intensity were calculated for SM RT-PCR and DNA microarray hybridization using the data in Fig. 2 and plotted against the Ct values obtained by real-time qRT-PCR. The results were compared and are shown in Fig. 4.

Figure 4 clearly demonstrates a better correlation between the results of SM RT-PCR and real-time qRT-PCR than between the results of DNA microarray hybridization and real-time qRT-PCR. This is reasonable because both SM RT-PCR and real-time qRT-PCR are PCR-based techniques, and the same primer pairs that were proven useful in the SM RT-PCR were used in the real-time qRT-PCR experiments. Based on these results, we concluded that the differences in gene expression that we observed by SM RT-PCR were real.

Figure 4. Correlation between the band intensity obtained from the SM RT-PCR or fluorescence intensity obtained from DNA microarray hybridization and the Ct values obtained from the real-time qRT-PCR experiments. The log2 values of the band intensity (closed circle) or fluorescence intensity (open triangle) were plotted along the Y-axis against the Ct values on the X-axis. The DYM gene was used as a control because this gene was ubiquitously expressed in large quantity in all the cells and tissues that were examined in both the SM RT-PCR and the DNA microarray hybridization experiments. Negative and zero values obtained by microarray hybridization experiments were assigned the value of 0.1 for these graphs. The portion of the DYM results was enlarged and is shown in the right graph on the top row.
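
One simple way to put a number on the agreement between log2 intensities and Ct values is a correlation coefficient. The sketch below uses the Pearson coefficient, which is an assumption made here for illustration; the text compares the plots without naming a statistic, and all data values are hypothetical. Because a lower Ct means more starting template, good agreement appears as a strongly negative coefficient.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data for one gene across five samples.
ct_values = [18.5, 22.0, 25.5, 29.0, 31.5]
log2_band_intensity = [11.8, 9.9, 8.1, 6.0, 4.2]
print(round(pearson_r(ct_values, log2_band_intensity), 3))
```

Computing the coefficient once for the SM RT-PCR intensities and once for the microarray intensities would allow the two methods to be compared against the same qRT-PCR reference.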


3.4  Real-time  qRT-PCR  of  clinical  specimens  of  breast 
cancer 

As the next step, we performed real-time qRT-PCR using cDNA prepared from clinical specimens of breast cancer. Cancer cell lines provide a useful starting point for the discovery and functional analysis of genes involved in cancer. The alterations found in cancer cell lines, however, may not necessarily be present in the original tumors; they may have been acquired during long cultivation in vitro. Therefore, it was necessary to evaluate whether the same differences are also observed in clinical specimens of cancer. We did this using cDNA prepared from matched normal and tumor pairs of breast tissues. The expression of the DYM gene was used as a control to normalize the expression levels. The subtractive Ct values (Ct of the gene minus Ct of DYM) of the matched normal (X-axis) and tumor (Y-axis) breast tissue pairs were plotted, and partial results are shown in Fig. 5. Downregulation of gene expression was observed for the CCBE1, NP_115536.1, NP_689683.2, and TCF4 genes in a majority of the clinical cases of breast cancer (11, 9, 9, and 11 out of 12 cases, respectively). A reduction of greater than 50% was observed in 8, 5, 7, and 9 cases, respectively.
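
The DYM normalization and the 50% threshold can be illustrated with standard comparative Ct arithmetic. This sketch assumes an amplification efficiency of roughly 100% (one cycle equals a twofold difference), which the text does not state; the function names and Ct values are hypothetical.

```python
def subtractive_ct(ct_gene, ct_dym):
    """Ct of the gene of interest minus Ct of the DYM control (higher = less expression)."""
    return ct_gene - ct_dym


def tumor_to_normal_ratio(normal_ct, normal_dym_ct, tumor_ct, tumor_dym_ct):
    """Relative expression of tumor versus matched normal tissue.

    Assumption: each PCR cycle doubles the product, so a difference of one
    cycle in subtractive Ct corresponds to a twofold change in expression.
    """
    delta_normal = subtractive_ct(normal_ct, normal_dym_ct)
    delta_tumor = subtractive_ct(tumor_ct, tumor_dym_ct)
    return 2.0 ** (delta_normal - delta_tumor)


def reduced_more_than_half(ratio):
    """True when tumor expression is below 50% of the matched normal tissue."""
    return ratio < 0.5


# Hypothetical matched pair: the tumor point lies above the line y = x.
ratio = tumor_to_normal_ratio(normal_ct=26.0, normal_dym_ct=20.0,
                              tumor_ct=28.5, tumor_dym_ct=20.5)
print(ratio, reduced_more_than_half(ratio))
```

Under this assumption, a tumor point lying more than one cycle above the line y = x in a plot of subtractive Ct values corresponds to a reduction of greater than 50%.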

Figure 5. Expression of the selected genes in matched normal and cancer breast tissues. The gene expression was determined for the CCBE1, TCF4, NP_115536.1, and NP_689683.2 genes by real-time qRT-PCR using cDNA prepared from 12 matched normal and cancer breast tissues. The expression of the DYM gene was used to normalize the expression data. The subtractive Ct values (minus Ct DYM) of normal tissues are mapped on the X-axis and those of the corresponding tumor tissues from the same individuals are on the Y-axis. The line y = x is also shown. The dots above the line indicate downregulation in tumor, whereas dots below indicate upregulation.

4 Discussion

Using a model SM (RT-)PCR system that contained genes from autosomes and the X chromosome, we previously demonstrated that less than a twofold difference in copy number could be detected by SM PCR [11]. In the present study we applied SM PCR and SM RT-PCR to examine the changes in copy number and expression of more than a hundred genes in breast and prostate tumors and cancer cell lines. The total number of genes analyzed by SM (RT-)PCR was 133, exceeding the number analyzed in any one of our previous studies. We focused on the genes in the chromosomal region 18q21-qter because loss of this region has been repeatedly observed in breast and prostate cancers, and the tumor suppressor genes whose inactivation could contribute to the development of these cancers have yet to be identified.

We observed a striking increase in band intensity for the SLC14A1 gene in MCF7. SLC14A1 is a member of the SLC14 gene family of urea transporters [16]. Although the copy number increase of the SLC14A1 gene is associated with increased gene expression, this difference is unique to MCF7 and therefore does not seem to be a common phenomenon. Both the SM PCR and arrayCGH methods failed to detect the sequences of the SMAD4 and ELAC1 genes in the MDA-MB-468 cell line. This is in line with the expression studies, in which little or no expression was found in the SM RT-PCR and microarray experiments, suggesting homozygous deletion of the genes. Homozygous deletion was previously reported for the SMAD4 gene in pancreatic cancer [4] and for the ELAC1 gene in a lung carcinoma cell line, Ma29 [17]. The results differed between the two methods for the RAX and PLEKHE1 genes. RAX is a paired-type homeobox gene, and the CpG island associated with the RAX gene promoter was found to be methylated in melanoma [18]. We observed a considerably decreased signal only by the SM PCR method. However, the functional significance is not clear because the gene was rarely expressed in normal and cancerous breast and prostate cells and tissues. The PLEKHE1 gene, which is also called PHLPP, encodes a PH domain leucine-rich repeat protein phosphatase that specifically dephosphorylates the hydrophobic motif of Akt protein kinase, promotes apoptosis, and suppresses tumor growth [19]. An additional screening showed that MDA-MB-175-VII, another breast cancer cell line, also failed to yield the SM PCR band from the gene, in addition to MDA-MB-468. However, subsequent DNA sequencing determined that, rather than a homozygous deletion of the PLEKHE1 gene, a 3-nucleotide deletion (GCA at nt 4743-4745) in those cell lines within one of the two primer sequences used in the amplification was responsible for the disappearance of the band. The difference, which causes the deletion of one of the two glutamines at amino acids 1580-1581, was not found in the Ensembl or GenBank SNP databases, and its functional significance remains to be elucidated.

The first difference we observed between the SM RT-PCR and DNA microarray hybridization experiments was that the results from the DNA microarray hybridization experiments tended to scatter to the lowest and highest extremes, unlike the results from the SM RT-PCR experiments. This is reasonable because the wider linear range of signal detection by the hybridization method allowed the detection of stronger signals without saturation, and also because PCR-based SM RT-PCR could detect signals from rare transcripts by amplification. The second difference was that there were eight genes whose expression was not detected by the hybridization method but was detected by the SM RT-PCR method. For some of the genes, it is possible that the DNA fragments amplified by SM RT-PCR and the oligonucleotide probes in the BeadChips were derived from different, alternatively spliced exons, or that the expression was too low to be detected without amplification. However, we suspect that inadequate probes that were not pre-tested may have caused some of the false-negative results. In the SM RT-PCR experiments, we used a normal genomic DNA template, which contains all the genes, to establish the system. Primers that failed to amplify DNA fragments of the expected size or that produced additional bands were excluded. The problem with the DNA microarray hybridization method is that not all the probes in a DNA microarray have been tested for their utility.

In contrast to the copy number analysis, we observed more differences in gene expression. Decreased expression in tumors and cancer cell lines was observed for both known protein-coding genes and uncharacterized potential genes. The known genes include coiled-coil domain-containing protein 11 (CCDC11), RING finger protein 152 (RNF152), T cell-specific transcription factor 4 (TCF4) [20, 21], cytoplasmic protease inhibitor 8 (SERPINB8) [22], collagen and calcium binding EGF domains 1 (CCBE1), CD226 antigen precursor [23], and PSTPIP2, which regulates F-actin bundling and enhances filopodia formation and motility in macrophages [24]. The uncharacterized genes include NP_689683.2, NP_115536.1, and KIAA0427. The down-regulation was confirmed for CCDC11, RNF152, TCF4, SERPINB8, CCBE1, PSTPIP2, and KIAA0427 by microarray hybridization, whereas the expression of CD226 and NP_115536.1 was undetectable by the microarray hybridization method in certain normal cells/tissues as well as in several cancer cells. The result of hybridization was not available for the NP_689683.2 gene. In the 18q21-qter region, consistently increased expression in cancer was observed by SM RT-PCR only for the ONECUT2 (OC2) gene, which encodes a transcription factor characterized by the presence of a single “cut” domain and an atypical homeodomain [25]. The increase was not so obvious by microarray hybridization, possibly because of its low level of transcription.


Real-time qRT-PCR confirmed that all the above-mentioned differences in gene expression in the cancer cell lines were real. However, only a subset of the genes remained candidates after real-time qRT-PCR of clinical specimens of breast cancer. Downregulation of expression was observed for the CCBE1, TCF4, NP_115536.1, and NP_689683.2 genes in 11, 11, 9, and 9 out of 12 breast tumors, respectively. The results also showed that the CCBE1 and TCF4 genes were the most promising candidates among those examined, because decreased expression in tumors was observed at the highest frequency in breast cancer cases. In contrast to TCF4, whose link to breast cancer has recently been suggested [21], little is known about the CCBE1 gene and protein, except that the amino acid sequence of the CCBE1 protein predicts the presence of a signal peptide and collagen and calcium binding EGF domains. Because these domains are found in some extracellular matrix proteins, the loss of CCBE1 protein expression may result in changes in cellular characteristics, such as adhesion and motility. Further study is underway to pursue this possibility. The difference in gene expression was not so obvious for SERPINB8 (7/12 downregulated) and ONECUT2 (7/12 upregulated). Surprisingly, more tumors were found to exhibit increased expression than decreased expression of the CCDC11 (10/12 up), CD226 (9/12 up), and RNF152 (8/12 up) genes, in contrast to the decreased expression observed in the cancer cell lines. Heterogeneity in the cellular composition of tissues and contamination of tumor tissues with normal cells and infiltrating lymphocytes may have contributed to the results. The different environments that surround the cells in vitro and in vivo may also have affected gene expression. However, the discrepancy may also be explained by the acquisition of downregulation of the genes after the cancer cells were brought into in vitro culture.

Of the three previously identified cancer genes, SMAD4 was discussed above. Not much change was observed for SMAD2 in copy number or gene expression. For BCL2, expression was found to be lower in both normal and cancerous cells than in normal breast and prostate tissues. Whether cells other than epithelial cells are responsible for the higher expression in those tissues, or whether the loss of 3-D architecture shuts down transcription, is a question that needs to be answered. In either case, it is unlikely that BCL2 plays an oncogenic role in breast or prostate cancer, as opposed to follicular lymphoma, in which it does.

We started developing the SM PCR and SM RT-PCR methods when the DNA
microarray hybridization technique was still in its infancy and only
dozens of laboratories were successful in producing meaningful
results. During the past several years, significant progress has
been made. High-quality commercial DNA microarrays have become
available, and companies that perform custom hybridization have
appeared. Concerns were raised about the reliability of
DNA microarray results because the results varied considerably
among different platforms and different laboratories [26]. To
improve cross-platform concordance and to minimize the variation
among laboratories, the MicroArray Quality Control (MAQC) project
was launched, which recommended the use of standard RNA samples for
comparison. Thanks to those efforts, DNA microarray techniques have
been maturing. The results of the copy number analysis done at
NimbleGen were satisfactory, although
the  turn-around  time  of  6  weeks  was  longer  than  we 
expected.  The  cost  was  reasonable.  However,  it  was  too 
expensive  for  us  to  order  hybridization  experiments  with 
all  17  specimens  analyzed  by  SM  PCR,  and  therefore,  we 
outsourced  only  the  3  hybridization  experiments  with  the 
cell  lines  that  were  shown  to  exhibit  significant  changes  in 
copy  number  by  SM  PCR.  The  number  of  detection  points 
differed drastically between the two experiments. Whereas a little
more than 5000 probes, spaced at intervals of 6000 base pairs, were
examined over 18q21-qter by the arrayCGH method using the NimbleGen
microarrays, only 134 detection points were examined by the SM PCR
method. In spite of this large difference in number, both results
exhibited impressively similar patterns of copy number changes.
DNA microarray hybridization experiments for genome-wide gene
expression were performed at the DNA Microarray Facility at the
institute. The Sentrix Human-6 Expression BeadChips carried 47,293
probes, fewer than the NimbleGen microarrays (385,000). Although
each probe was represented by an average of 30 beads in the
BeadChips, the confidence level may have been higher with the
NimbleGen microarrays because the probes that failed to hybridize
with reference DNA were excluded from consideration in the copy
number analysis by arrayCGH.
Compared  with  the  hybridization  using  commercial  DNA 
microarrays,  establishment  of  the  SM  (RT-)PCR  system  is 
laborious  and  time-consuming.  Because  the  results  are 
obtained  in  multiple  sets,  there  is  a  variation  in  the  results 
among  different  sets.  However,  the  situation  may  not  be 
much  different  from  DNA  microarrays  where  the  amounts 
of  DNA  printed  may  vary  among  different  probes. 

In  the  present  study,  we  demonstrated  the  utility  of  the 
SM  PCR  and  SM  RT-PCR.  We  also  verified  that  the  results 
obtained  by  our  methods  were  comparable  in  quality  to  the 
results  obtained  by  DNA  microarray  hybridization  method, 
although  they  were  not  identical.  The  ability  to  identify  the 
amplified fragments by their size may counteract the deficiency
of nonlinear amplification of signal by PCR and
make  the  SM  PCR  and  SM  RT-PCR  methods  as  useful  as 
the  DNA  microarray  hybridization  method.  It  should  be 
reiterated  that  we  would  have  missed  the  opportunity  of 
identifying  both  the  CCBE1  and  TCF4  genes  as  promising 
candidates  if  we  had  not  performed  SM  RT-PCR.  The 
results of DNA microarray hybridization using Illumina’s BeadChips
could be interpreted to mean that the CCBE1 gene was expressed in
all the cell lines and tissues examined, rather than not being
expressed in some of the cancer cell lines. This is because the
values, which ranged from 24 to 58, were nonzero, and negative
values were obtained with some other genes as shown in Fig. 2. On
the other hand, the value of 2 for the TCF4 gene in the primary
culture of mammary epithelial cells could be interpreted as no
expression, and the gene could have been excluded. The results of
SM RT-PCR showed a high and nearly linear correlation with the
results of real-time qRT-PCR, which demonstrated that the SM RT-PCR
results do not require confirmation by real-time qRT-PCR. This
contrasts with DNA microarray hybridization, the results of which
always need to be confirmed.
ABSTRACT


Losses  of  the  p-arm  of  chromosome  8  are  frequently  observed  in  breast,  prostate,  and  other  types 
of  cancers.  Using  the  Systematic  Multiplex  RT-PCR  (SM  RT-PCR)  method  that  we  developed, 
we  examined  the  expression  of  238  genes  located  on  the  p-arm  of  chromosome  8  in  five  breast 
and three prostate human cancer cell lines. We observed frequent decreases in expression of two
dozen genes and increases in expression of several genes on this chromosomal arm. These
changes in gene expression of the cell lines were later confirmed by real-time qRT-PCR.
Additionally and more importantly, we found that some of the changes were also observed in the
majority of breast cancer clinical cases that we examined. These included down-regulation of the
MYOM2, NP_859074, NP_001034551, NRG1, PHYIP (PHYHIP), Q7Z2R7, SFRP1, and SOX7
genes and up-regulation of the ESCO2, NP_115712 (GINS4), Q6P464, and TOPK (PBK) genes.


Keywords: 

Systematic  Multiplex  RT-PCR  (SM  RT-PCR),  real-time  qRT-PCR,  DNA  microarray 
hybridization,  chromosome  8p,  chromosomal  scanning,  gene  expression,  breast  cancer,  prostate 
cancer 
1. INTRODUCTION


The  activation  of  oncogenes  and  the  inactivation  of  tumor  suppressor  genes  both  play  important 
roles  in  carcinogenesis.  Most  changes  in  these  activating/inactivating  processes  occur  in  copy 
number,  gene  expression,  or  nucleotide/amino  acid  sequences.  Therefore,  the  determination  of 
copy  number  and  gene  expression,  together  with  nucleotide  sequencing,  assists  in  the 
identification  of  oncogenes  and  tumor  suppressor  genes.  For  the  activation  of  an  oncogene,  a 
monoallelic  dominant  change  is  often  sufficient.  Examples  of  monoallelic  activation  include 
transcriptional activation of the BCL2 (B-cell leukemia 2) gene by the t(14;18) translocation that
places this gene next to an active promoter in follicular lymphoma (1), MYCN gene
amplification that is concomitant with increased gene expression in neuroblastoma (2), and
activating mutations in the KRAS2 gene in cancers of the lung, colon, pancreas, and others (3-5).
However,  for  the  inactivation  of  a  tumor  suppressor  gene,  haplo-insufficiency  is  rare  and  the 
disruption  of  both  alleles  (biallelic  inactivation)  is  usually  necessary. 

Quantitative  analysis  of  copy  number  progressed  when  the  comparative  genomic 
hybridization  (CGH)  method  was  invented  based  on  the  two-color  fluorescence  in  situ 
hybridization  (FISH)  (6).  In  CGH,  genomic  DNA  from  a  test  sample  is  labeled  with  one 
fluorescent  color,  a  reference  genomic  DNA  is  labeled  with  another  color,  and  they  are  mixed 
and  hybridized  with  metaphase  chromosomal  spreads  of  normal  cells.  The  ratio  of  the  two 
fluorescence  intensities,  rather  than  the  absolute  intensity,  is  used  to  monitor  the  difference  in 
copy  number.  Using  this  technique,  many  maps  of  chromosomal  alterations  in  cancer  were 
produced.  It  was  shown  that  there  was  a  significant  degree  of  heterogeneity  among  a  variety  of 
tumors, as well as within the same type of tumor. Chromosomal gains and losses, which are
indicative of the presence of oncogenes and tumor suppressor genes, respectively, were located
on the chromosomes. For example, frequent gains in chromosomal arms 1q, 3q, 8q, 16p, 17q, and
20q and losses in 1p, 6q, 8p, 13q, 16q, 17p, 18q, 22q, and X were reported in breast cancer (7, 8).
Chromosomal losses were more frequent than gains in prostate cancer and were observed on
chromosomal arms 1p, 5q, 6q, 8p, 10q, 13q, 16q, and 18q (9, 10). The use of BAC clone DNA
microarrays  (11,  12)  and  cDNA  fragment  microarrays  (13,  14)  for  the  CGH  karyotyping  analysis 
of  copy  number  has  produced  a  more  powerful  and  high-resolution  arrayCGH  method. 

Several  tumor  suppressor  genes  have  been  identified  in  the  chromosomal  regions  of 
losses.  These  include  CDH1  on  16q22  (15)  and  PTEN  on  10q23  (16).  The  inactivation  of  those 
genes  was  considered  to  be  the  selective  force  that  resulted  in  the  loss  of  the  corresponding 
chromosomal  regions  because  of  the  frequent  abnormalities  and  functional  failure  of  the  proteins 
encoded by those genes. Aiming to identify novel genes with tumor suppressor activity, we
started gene expression analysis. We chose the p-arm of chromosome 8 because this arm is one
of  the  chromosomal  arms  most  frequently  lost  in  breast  and  prostate  cancers,  strongly  suggesting 
that  the  region  may  harbor  tumor  suppressor  genes  involved  in  the  pathogenesis  of  those  cancers 
(8,  17).  Although  breast  and  prostate  cancers  both  progress  from  an  early,  sex  hormone- 
dependent,  organ-confined  disease  to  a  highly  invasive,  hormone-independent,  metastatic 
disease, they arise in two different organs. We reasoned that pursuing tumor suppressor genes
common to cancers of these two organs would make it easier to exclude irrelevant genes whose
expression is specific to either normal mammary or normal prostate epithelial cells.
Here, we report the results obtained by analysis of gene expression on this chromosomal arm by
the moderately high-throughput Systematic Multiplex RT-PCR (SM RT-PCR) method (18-21).
2. MATERIALS & METHODS

2.1  SM  RT-PCR  experiments  to  measure  expression  of  the  genes  on  the  p-arm  of 
chromosome  8 

The  following  RNA  samples  were  used  for  the  gene  expression  analysis:  a  normal  and  a  primary 
tumor  tissue  of  breast  from  a  patient  with  invasive  ductal  carcinoma,  a  normal  and  a  primary 
carcinoma  tissue  of  prostate  from  a  patient  with  prostate  cancer,  another  normal  prostate  tissue, 
and  a  hyperplastic  prostate  tissue,  5  mammary  (BT-20,  MCF7,  MDA-MB-231,  MDA-MB-468, 
and  T-47D)  and  3  prostate  (DU145,  LNCaP,  and  PC3)  cancer  cell  lines,  and  primary  cultures  of 
normal  mammary  and  prostate  epithelial  cells.  cDNA  was  prepared  by  reverse-transcription  of 
total  RNA  using  oligo  dT  as  a  primer  and  the  Advantage  RT-for-PCR  Kit  (BD  Biosciences- 
Clontech). 

We  followed  the  SM  RT-PCR  experimental  protocols  described  previously  (18-21). 
Briefly,  the  genes  on  8p  were  categorized  into  groups  of  ~10  genes,  and  PCR  primers  were 
designed  to  amplify  different  sizes  of  DNA  fragments  from  single  exons  of  the  genes  in  a  group. 
After  the  multiplex  reactions  using  genomic  DNA  from  normal  human  tissues  as  a  control,  the 
concentrations  of  the  primers  were  adjusted  to  produce  bands  of  similar  intensities.  Once  the 
conditions  were  elaborated,  cDNA  samples  from  the  human  cells  and  tissues  were  then  used  as 
templates  to  examine  gene  expression.  Small  aliquots  of  the  SM  RT-PCR  reaction  products  were 
loaded  onto  an  8%  polyacrylamide  gel  and  electrophoresed.  The  gels  were  stained  with  ethidium 
bromide,  the  gel  pictures  were  taken,  and  the  images  were  saved  in  TIFF  format.  The  band 
intensity was measured using the ImageQuant software (Amersham Biosciences) and normalized
by  adjusting  the  average  band  intensities  of  individual  gels. 
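
The gel-to-gel normalization described above can be illustrated with a minimal Python sketch. It assumes, as one possible reading of that sentence, that each gel is rescaled so that its mean band intensity matches the mean across all gels. The function name normalize_gels, the gel labels, and the intensity values are hypothetical; they are not taken from the ImageQuant software or from the study data.

```python
from statistics import mean


def normalize_gels(gels):
    """Rescale every gel so that its mean band intensity equals the grand mean.

    `gels` maps a gel label to a {gene: band intensity} dictionary.
    This is an assumed normalization scheme, shown for illustration only.
    """
    gel_means = {label: mean(bands.values()) for label, bands in gels.items()}
    target = mean(gel_means.values())
    return {
        label: {gene: value * target / gel_means[label] for gene, value in bands.items()}
        for label, bands in gels.items()
    }


# Hypothetical band intensities from two gels of the same multiplex set.
gels = {
    "gel_A": {"GENE1": 100.0, "GENE2": 300.0},
    "gel_B": {"GENE1": 50.0, "GENE2": 150.0},
}
normalized = normalize_gels(gels)
print(normalized)
```

After rescaling, the two hypothetical gels report the same intensities because they differed only by an overall brightness factor, which is the kind of variation this step is meant to remove.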

2.2  DNA  microarray  hybridization  experiments  to  determine  gene  expression 

For  comparison,  the  DNA  microarray  hybridization  experiments  were  performed.  Illumina’s 
Sentrix  Human-6  Expression  BeadChips,  which  contained  probes  from  the  entire  23,000  RefSeq 
collection  and  an  additional  23,000  other  expressed  sequences,  were  used.  The  following  RNA 
samples  were  analyzed:  a  normal  breast  tissue,  a  normal  prostate  tissue,  primary  cultures  of 
normal  mammary  and  prostate  epithelial  cells,  5  mammary  (BT-20,  MCF7,  MDA-MB-231, 
MDA-MB-468,  and  T-47D)  and  3  prostate  (DU145,  LNCaP,  and  PC3)  cancer  cell  lines.  The 
same  preparations  of  RNA  that  were  used  in  SM  RT-PCR  were  used  in  the  microarray 
hybridization  experiments.  Following  Illumina’s  protocol,  biotinylated  cRNA  was  prepared  and 
hybridized  with  the  BeadChips.  After  washing,  the  BeadChips  were  treated  with  Cy3-labelled 
streptavidin, washed, dried, and scanned for fluorescence intensity with Illumina’s BeadStation
500.  Raw  data  were  generated  and  normalized  using  the  Beadscan  3.0  software.  The  gene 
expression  data  for  the  genes  on  the  p-arm  of  chromosome  8  were  extracted. 

2.3  Real-time  qRT-PCR  experiments  to  measure  gene  expression 

Real-time  qRT-PCR  of  the  selected  genes  was  performed  using  the  same  set  of  cDNA  from  the 
cells  and  tissues  that  were  analyzed  by  the  DNA  microarray  hybridization  experiments,  together 
with the genomic DNA control. The same preparations of cDNA that were used in the SM RT-PCR
were used in the real-time qRT-PCR experiments. In addition to this subset of the cDNA
samples,  additional  cDNA  samples  prepared  from  12  matched  pairs  of  normal  and  tumor  breast 
tissues  were  also  analyzed  by  real-time  qRT-PCR.  The  same  primer  pairs  that  were  used  in  the 
SM  RT-PCR  experiments  were  also  used  in  the  real-time  qRT-PCR  experiments.  The  reagent 
from the Power SYBR Green PCR Master Mix (Applied Biosystems) was used, and the yields of
the PCR products were monitored using the Mx3000p system (Stratagene) under the default
conditions, except that the annealing temperature was raised from 55 °C to 60 °C.
Data were analyzed using the MxPro software, and the Ct values were obtained for the
individual reactions. For normalization, the Ct values of the ubiquitously expressed ASAH1 gene,
which is located on 8p, were subtracted from those values.
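
The subtraction of the ASAH1 Ct values can be sketched in a few lines of Python. The Ct values below are invented for illustration, and the helper names delta_ct and relative_level are hypothetical. The conversion to a relative message level assumes that the PCR product doubles in every cycle, which is an idealization rather than a measured amplification efficiency.

```python
def delta_ct(ct_values, reference_gene="ASAH1"):
    """Subtract the reference gene's Ct from the Ct of every other gene."""
    reference_ct = ct_values[reference_gene]
    return {
        gene: ct - reference_ct
        for gene, ct in ct_values.items()
        if gene != reference_gene
    }


def relative_level(dct):
    """Approximate message level relative to the reference gene.

    Assumes perfect doubling per cycle (an idealization).
    """
    return 2.0 ** (-dct)


# Hypothetical Ct values for one cDNA sample (not data from the study).
sample_ct = {"ASAH1": 20.0, "SFRP1": 26.5, "TOPK": 23.0}
sample_dct = delta_ct(sample_ct)
print(sample_dct)
print(relative_level(sample_dct["TOPK"]))
```

A larger subtractive Ct value corresponds to a lower message level relative to ASAH1, because more cycles are needed to reach the detection threshold.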
3. RESULTS


3.1  SM  RT-PCR  analyses  of  the  genes  on  the  8p  chromosomal  arm 

We  established  the  SM  RT-PCR  system  consisting  of  254  genes.  They  were  categorized  into  26 
groups.  The  list  of  the  genes  is  shown  in  Table  1,  together  with  the  nucleotide  sequences  and 
concentrations  of  the  primers  used  in  this  study  and  the  sizes  of  the  amplified  DNA  fragments. 
We  examined  the  expression  of  those  254  genes  in  normal  and  cancerous  breast  and  prostate 
cells  and  tissues.  Results  are  shown  in  Figure  1.  Because  the  PCR  conditions  were  elaborated  so 
that  small  amounts  of  genomic  DNA  would  produce  bands,  the  absence  of  at  least  one  band  was 
considered  to  confirm  the  absence  of  contaminating  genomic  DNA  in  the  cDNA  specimens.  We 
found that approximately 42% of the genes were abundantly expressed in all of the cells and
tissues  that  were  examined.  We  also  observed  that  approximately  30  genes  were  not  expressed  or 
rarely  expressed  in  either  normal  or  cancerous  breast/prostate  cells/tissues.  The  remaining  genes 
were  differentially  expressed  in  some  of  the  cDNA  samples  examined. 

Among  them,  we  identified  a  dozen  genes  that  exhibited  unidirectional  changes  in  gene 
expression in both breast and prostate cancer cell lines. These include the GON1 (GNRH1) (set
16), NRG1 (set 18), PIWL2 (PIWIL2) (set 13), and Q7Z2R7 (sets 16 & 18) genes that were
found down-regulated in all 5 breast and 3 prostate cancer cell lines and the ESCO2 (set 20),
GSHR (GSR) (set 26), NP_115712 (GINS4) (set 23), Q6P464 (CDCA2) (set 16), and TOPK (PBK)
(set 17) genes that were found up-regulated in all of those cell lines, compared to the expression
in normal epithelial cells. We also identified additional genes that exhibited changes in a
majority of either breast, prostate, or both cancer cell lines. Those include the CH012 (C8orf12) (set
4), CH013 (set 5), DEF1 (DEFA1) (set 1), EGR3 (set 20), ENST357748 (ENST000000357748)
(set 11), FBX25 (FBXO25) (set 1), MYOM2 (set 2), NP_065895 (set 7), NP_859074 (set 19),
NP_001034551 (NP_1034551) (set 6), NPM2 (set 19), PHYIP (PHYHIP) (set 14), Q8NEP6 (set
5), Q96KT8 (set 6), SFRP1 (set 13), SOX7 (set 1), TPA (set 24), TR10D (TNFRSF10D) (set
15), and XR_017857 (C8orf48) (set 6) genes. We measured the intensity of the SM RT-PCR
bands  for  quantification.  The  results  were  then  aligned  by  the  chromosomal  locations  of  the 
genes  and  are  shown  in  the  left  column  of  Figure  2.  To  facilitate  the  comparison,  the  intensity 
was  shown  in  gray-scale  from  black  (weakest)  to  white  (strongest).  Out  of  the  254  genes,  238 
genes  were  mapped  on  the  p-arm  of  chromosome  8  in  the  newest  version  of  Ensembl  (version 
43)  and  the  results  of  those  238  genes  are  shown. 

3.2  DNA  microarray  hybridization  analysis  of  gene  expression  of  the  genes  on  8p 

In  order  to  compare  the  results  from  the  SM  RT-PCR  experiments  with  the  results  obtained  by  an 
established  method  of  DNA  microarray  hybridization,  we  performed  the  genome-wide  gene 
expression  analysis  using  Illumina’s  BeadChips.  Data  for  the  genes  on  8p  were  extracted  and 
aligned  based  on  the  chromosomal  locations  of  the  genes.  Because  the  PCR  primers  for  SM  RT- 
PCR  were  designed  based  on  the  sequences  that  were  not  alternatively  spliced,  only  the  data 
using  the  “singular”  or  “all”  probes  that  detect  all  the  messages  from  the  corresponding  genes 
were extracted from the Illumina data. The average fluorescence signal intensity of >30 beads
was extracted and gray-scaled, and is shown, side-by-side with the data from the SM RT-PCR
experiments, in the right column of Figure 2. The expression data were obtained for the 230


F.  Yamamoto  and  M.  Yamamoto 


genes  on  the  p-arm  of  chromosome  8,  195  of  which  overlapped  with  the  genes  whose  expression 
was  determined  by  SM  RT-PCR.  Compared  to  SM  RT-PCR,  the  results  from  the  DNA 
microarray  hybridization  experiments  exhibited  a  wider  range  of  intensity  as  anticipated.  There 
were  25  genes  whose  messages  were  not  detected  (fluorescence  intensity  below  10  in  all  the 
specimens).  The  number  increased  to  34  when  the  cut-off  fluorescence  intensity  was  set  at  15. 
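
The effect of the cut-off on the number of undetected genes can be illustrated with a short Python sketch. The gene labels and fluorescence values are invented, and the function name undetected_genes is hypothetical; the only rule taken from the text is that a gene counts as undetected when its intensity is below the cut-off in all specimens.

```python
def undetected_genes(intensities, cutoff):
    """Return genes whose signal is below `cutoff` in every specimen."""
    return sorted(
        gene
        for gene, values in intensities.items()
        if all(value < cutoff for value in values)
    )


# Hypothetical fluorescence intensities across three specimens.
signals = {
    "GENE_A": [2.0, 8.5, 9.9],    # below 10 everywhere
    "GENE_B": [12.0, 14.0, 11.0],  # below 15 everywhere, but not below 10
    "GENE_C": [5.0, 40.0, 3.0],    # clearly detected in one specimen
}
below_10 = undetected_genes(signals, 10)
below_15 = undetected_genes(signals, 15)
print(below_10, below_15)
```

Raising the cut-off can only add genes to the undetected list, which is consistent with the increase from 25 to 34 genes reported above.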

3.3  Gene  expression  analysis  of  the  selected  genes  by  real-time  qRT-PCR 

We  performed  real-time  qRT-PCR  to  re-examine  the  expression  of  the  genes  that  exhibited 
consistent  changes  in  expression  by  the  SM  RT-PCR  method.  The  same  set  of  cells  and  tissues 
that  were  analyzed  by  DNA  microarray  hybridization  were  examined  for  the  expression  of  the 
CH012, CH013, DEF1, EGR3, ESCO2, FBX25, GON1, GSHR, MYOM2, NP_065895,
NP_115712, NP_859074, NP_001034551, NPM2, NRG1, PHYIP, PIWL2, Q6P464, Q7Z2R7,
Q8NEP6, Q96KT8, SFRP1, SOX7, TOPK, TPA, TR10D, and XR_017857 genes. Because of
high  expression  of  the  messages  in  all  the  samples  in  both  the  SM  RT-PCR  and  the  DNA 
microarray  hybridization  experiments,  we  selected,  as  a  control  gene,  the  ASAH1  gene  (22). 
This  gene  encodes  N-acylsphingosine  amidohydrolase,  also  called  acid  ceramidase  (AC;  EC 
3.5.1.23),  which  catalyzes  the  synthesis  and  degradation  of  ceramide.  The  Ct  values  obtained  by 
real-time qRT-PCR were plotted on the X-axis. The log2 values of the measured band
intensity (SM RT-PCR) and fluorescence intensity (DNA microarray) of the genes from the
corresponding cDNA samples were
plotted on the Y-axis. Results from all the selected important genes are shown in Figure 3. The
results  are  shown  on  the  same  scale,  and  the  result  of  the  ASAH1  gene  was  also  enlarged  and 
shown  next  to  the  original  figure  on  the  top  row.  A  higher  degree  of  linearity  was  observed 
between the results from the SM RT-PCR experiments and the results from the real-time qRT-PCR
experiments than between the DNA microarray hybridization experiments and the real-time
qRT-PCR  experiments.  This  is  reasonable  considering  that  both  the  SM  RT-PCR  and  real-time 
qRT-PCR  are  PCR-based  and  the  same  primers  were  used  in  those  experiments.  The  differences 
observed  by  SM  RT-PCR  were  confirmed  to  be  real  by  real-time  qRT-PCR,  although  some  of 
them  were  not  observed  by  DNA  microarray  hybridization. 

We  next  examined  whether  the  same  differences  in  the  expression  level  that  were 
observed  in  the  breast  and  prostate  cancer  cell  lines  were  also  present  in  the  clinical  specimens  of 
cancer.  Because  we  had  matched  normal  and  tumor  pairs  of  breast  tissues  from  a  dozen  breast 
cancer  patients,  we  analyzed  the  expression  of  the  selected  genes  in  breast  cancer  by  real-time 
qRT-PCR.  To  normalize  the  Ct  values  of  the  individual  specimens,  we  subtracted  the  Ct  values 
of  the  ubiquitously  expressed  ASAH1  gene  from  those  values.  Down-regulation  was  observed 
with the following genes in a majority of the 12 breast cancer cases: MYOM2, PHYIP, SOX7 (10
cases), DEF1, FBX25, NP_001034551 (9 cases), CH012, GON1, NP_859074, NRG1, PIWL2,
Q7Z2R7, SFRP1 (8 cases), Q8NEP6, Q96KT8, and XR_017857 (7 cases). Similarly, up-regulation
was observed in a majority of breast cancer cases with the following genes: TOPK (10
cases), Q6P464 (9 cases), ESCO2, NP_115712 (8 cases), and GSHR (7 cases). For the remaining
genes, down-regulation was observed in 4 (NP_065895), 5 (TPA), or 6 cases (CH013, EGR3,
NPM2, TR10D). Many of the important results are shown in Figure 4, by plotting the subtractive
Ct  values  of  the  tumor  tissues  on  the  Y-axis  and  those  values  of  the  normal  adjacent  tissues  on 
the  X-axis. 
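
The case counts above can be reproduced in principle from matched pairs of subtractive Ct values, as in the following Python sketch. In a plot of this kind, a point above the diagonal has a larger subtractive Ct value in the tumor than in the matched normal tissue, which corresponds to a lower relative message level in the tumor. The pairs below are invented, and the function names are hypothetical.

```python
def count_regulation(pairs):
    """Count down- and up-regulated cases from (normal, tumor) subtractive Ct pairs.

    A larger subtractive Ct in the tumor means less message, i.e. down-regulation.
    """
    down = sum(1 for normal, tumor in pairs if tumor > normal)
    up = sum(1 for normal, tumor in pairs if tumor < normal)
    return down, up


def is_majority(count, total):
    """True when `count` is more than half of `total` cases."""
    return count > total / 2


# Hypothetical (normal, tumor) subtractive Ct values for four matched pairs.
pairs = [(5.0, 7.2), (4.8, 6.0), (5.5, 5.1), (6.1, 8.0)]
down, up = count_regulation(pairs)
print(down, up, is_majority(down, len(pairs)))
```

With 12 matched pairs, as in this study, a majority under this definition means at least 7 cases.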
4. DISCUSSION


Previously  we  established  the  SM  RT-PCR  systems  of  families  of  glycosyltransferases  (18), 
HOX  homeoproteins  (19),  and  integrins  (20).  We  also  gradually  increased  the  size  of  coverage 
from  a  few  dozen  genes  (21)  to  more  than  a  hundred  genes  in  a  few  cytobands  (23).  Here  we 
attempted  to  establish  the  SM  RT-PCR  system  of  more  than  200  genes  on  the  entire  arm  of  a 
chromosome. Excluding the DNA microarray hybridization approach, this SM RT-PCR analysis
is one of the largest attempts to understand the expression of genes on a chromosomal arm-wide
scale. We aimed to incorporate as many genes as possible into the system. Because the
defensin genes were highly homologous to one another and possessed short coding sequences, we
were unable to design primers that selectively amplified individual genes for several members
of the defensin gene family. Nonetheless, we included 254 genes in 26
multiplex  reactions  as  shown  in  the  primer  list  in  Table  1  and  the  genomic  DNA  lanes  (G)  in 
Figure  1.  However,  the  Ensembl  database  was  not  finished  at  the  time  that  we  retrieved  the  gene 
and sequence information. When we aligned our results to the most recent version, 43, we found
that 11 genes that were previously mapped in the region were no longer present. Furthermore, a
few  dozen  additional  genes  that  were  not  previously  mapped  have  been  added.  These  include 
novel  protein-coding  genes,  pseudogenes,  miRNA  genes,  snRNA  genes,  and  snoRNA  genes. 
Because  of  this  addition,  Figure  2  has  many  open  spaces,  for  which  no  expression  data  were 
available.  195  of  the  238  genes  whose  expression  was  determined  by  SM  RT-PCR  overlapped 
with  the  genes  whose  expression  was  determined  by  the  DNA  microarray  hybridization 
experiments.  A  generally  good  correlation  was  observed  in  the  results  between  the  SM  RT-PCR 
and DNA microarray hybridization experiments, except that the expression of approximately 40
genes was detected only by SM RT-PCR. We think that the probes of those genes used in the
DNA microarray hybridization were either inappropriate or not functioning as expected. The
results shown in Figure 3 also illustrate this problem. To calculate the log2 values of fluorescence
intensity from the DNA microarray hybridization results, we used 0.1 for the values below this
number. Still, when the fluorescence signal was weak, as in the cases of NP_001034551 and
PHYIP,  no  correlation  was  observed  between  real-time  qRT-PCR/SM  RT-PCR  and  DNA 
microarray  hybridization.  However,  when  fluorescence  signal  was  strong,  both  SM  RT-PCR  and 
DNA  microarray  hybridization  exhibited  linear  correlation  with  real-time  qRT-PCR  as  shown 
with  the  GSHR,  SFRP1,  and  TOPK  genes. 
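
The log2 transformation with a floor of 0.1, and a check of linearity against the Ct values, can be sketched as follows. The text does not state which statistic was used to judge linearity, so the Pearson correlation coefficient here is an assumption made for illustration. The function names and all numerical values are hypothetical.

```python
import math


def floored_log2(value, floor=0.1):
    """log2 of a signal, with values below `floor` replaced by `floor`."""
    return math.log2(max(value, floor))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: the signal falls four-fold for every two extra cycles.
ct = [20.0, 22.0, 24.0, 26.0]
signal = [1600.0, 400.0, 100.0, 25.0]
log_signal = [floored_log2(s) for s in signal]
r = pearson(ct, log_signal)
print(round(r, 6))
```

Because a higher Ct value means less message, a good agreement between the two methods appears as a strongly negative correlation. Signals at or below the floor all map to the same log2 value, so weakly expressed genes cannot show any trend, which matches the lack of correlation described above for weak fluorescence signals.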

In  addition  to  the  cell  lines,  we  also  examined  the  expression  of  the  selected  genes  in  the 
clinical  specimens  of  breast  cancer.  As  opposed  to  the  in  vitro  cultured  cancer  cell  lines  that 
consist  of  a  relatively  uniform  population  of  cells,  tissues  are  made  of  several  different  types  of 
cells  and  their  ratios  vary  among  different  specimens.  Therefore,  measurement  of  the  Ct  values 
without  standardization  was  not  informative.  We  used  the  expression  of  highly  and  ubiquitously 
expressed  ASAH1  gene  as  a  standard.  By  subtracting  the  Ct  values  of  the  ASAH1  gene,  we 
compared  the  relative  ratios  of  the  gene  messages  among  different  specimens.  Rather  than 
comparing  the  normal  and  cancer  tissue  specimens  as  two  groups,  we  plotted  the  results  from  the 
individual  pairs  of  cancer  tissue  specimens  and  their  corresponding  normal  adjacent  tissue 
specimens  on  the  Y-  and  X-axes,  respectively.  The  tendencies  of  up-  and  down-regulation  in 
gene  expression  were  easily  confirmed  with  most  of  the  genes  examined.  For  several  genes,  the 
tendencies  were  not  clear  with  the  breast  clinical  specimens.  Several  potential  reasons  can  be 
speculated.  One  possibility  is  that  the  cancer  cell  lines  may  have  acquired  down-regulation  in 
expression of those genes after they were brought into in vitro culture. Another possibility is that
cells,  other  than  cancer  cells,  that  were  present  in  the  tumor  tissues  express  these  genes  and 
losses/decreases  in  cancer  cells  may  have  been  masked. 

Among  the  genes  that  exhibited  a  matched  tendency  of  up-regulation  in  the  breast  tumor 
tissues and breast cancer cell line cells, the tendency was striking with the ESCO2, TOPK,
NP_115712, and Q6P464 genes. Because the ESCO2 gene is required for the establishment of
sister chromatid cohesion during S phase of the cell cycle (24), the TOPK gene encodes a
serine/threonine kinase that binds to the PDZ2 domain of the Drosophila Discs-large (Dlg) tumor
suppressor protein that regulates the cell cycle and/or cellular proliferation (25), the NP_115712
protein is a component of the GINS complex that is essential for the initiation of DNA
replication (26), and the Q6P464 gene is associated with the cell division cycle, up-regulation of
these  genes  in  tumors  and  cancer  cell  lines  may  simply  be  a  reflection  of  a  higher  number  of 
dividing  cells  in  those  specimens.  These  four  genes  were  elevated  in  gene  expression  in  all  5 
breast  and  3  prostate  cancer  cell  lines  that  were  examined,  suggesting  that  this  is  a  likely 
possibility. 

Among  the  genes  that  were  down-regulated  in  a  majority  of  breast  cancer  cases,  4  genes, 
GON1,  NRG1,  PIWL2,  and  Q7Z2R7,  were  also  down-regulated  in  all  5  breast  and  3  prostate 
cancer  cell  lines  that  were  examined.  Additionally,  4  genes  (NP_859074,  NP_001034551, 
PHYIP,  and  SOX7)  exhibited  down-regulation  in  all  the  5  breast  cancer  cell  lines  examined  and 
1  gene  (SFRP1)  exhibited  down-regulation  in  a  majority  of  breast  and  prostate  cancer  cell  lines. 
Two genes, CH012 and XR_017857, exhibited down-regulation in a minority of cell lines, and 5
genes (DEF1, FBX25, MYOM2, Q8NEP6, and Q96KT8) showed decreased expression only in
the breast cancer cell lines. Among these down-regulated genes, statistically significant
tendencies were observed with the MYOM2, NP_859074, NP_001034551, NRG1, PHYIP,
Q7Z2R7, SFRP1, and SOX7 genes, shown in Figure 4. The tumor-suppressing role of the NRG1
gene has been well established. The gene encodes neuregulin 1 (heregulin) that interacts
with  the  NEU/ERBB2  receptor  tyrosine  kinase  to  increase  its  phosphorylation  on  tyrosine 
residues  (27).  The  purified  protein  induces  phenotypic  differentiation  of  breast  and  prostate 
cancer  cells  and  inhibits  cell  growth  (28,  29).  The  SFRP1  and  SOX7  genes  play  a  similar  role  in 
carcinogenesis  by  repressing  the  Wnt  signaling  inside  the  cell.  The  SFRP1  gene  encodes  a 
secreted  apoptosis-related  protein  that  interferes  with  the  Wnt-frizzled  signaling  pathway  (30, 
31). A potential role of the SFRP1 gene in tumor suppression in the breast and prostate was
previously suggested (32, 33). The SOX7 gene encodes a transcription factor that possesses a
functional transactivation domain in the C-terminus and significantly reduces Wnt/beta-catenin-
stimulated transcription (34). The presence among the candidates of three genes, NRG1, SFRP1,
and SOX7, which are known to be involved in carcinogenesis, indicates that the approach is
working as expected.

In  addition  to  those  cancer-related  genes,  we  also  identified  MYOM2,  PHYIP,  and  three 
poorly  characterized  candidate  genes.  The  MYOM2  and  PHYIP  genes  encode  myomesin  2,  an 
M-band  protein  of  sarcomeres  (35),  and  phytanoyl-CoA  hydroxylase-interacting  protein  (36), 
respectively. The NP_859074 gene is predicted to encode a protein with an EF-hand domain (37),
and the NP_001034551 and Q7Z2R7 genes are predicted to encode proteins of 121 and 83 amino
acid residues, respectively. Little else is known about those genes. However, the Q7Z2R7 gene is
separated from the NRG1 gene by only 1,577 bp, and the expression profiles of the two genes
were similar in the cells and tissues that were examined. Because the two genes have the same
orientation, the Q7Z2R7 sequence may be transcribed as part of the 3' untranslated region of
the NRG1 messages rather than independently from its own promoter.
Further  studies  will  be  needed  before  concluding  that  these  candidates  are  genes  with  tumor 
suppressor  activity.  In  summary,  we  have  shown  that  the  SM  RT-PCR  approach  is  successful  in 
the  identification  of  genes  with  altered  expression  through  scanning  of  the  genes  at  the 
subchromosomal level. Because each multiplex reaction covered 10 genes on average, the SM
RT-PCR experiments required about one-tenth as many reactions as real-time qRT-PCR of
individual genes. Together with the use of pre-confirmed primers and greater flexibility in
design than DNA microarrays offer, this advantage may allow SM RT-PCR to find its niche
between the discovery method of high-throughput DNA microarray hybridization and
quantitative real-time RT-PCR.
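
The saving in reaction count can be illustrated with simple arithmetic. The sketch below is illustrative only: the total of 260 genes is a hypothetical round number (26 sets, as stated in the Figure 1 legend, times the stated average of 10 genes per set), not the exact number of primer pairs, and the function name is an assumption of this example.

```python
import math


def reactions_needed(n_genes: int, genes_per_reaction: int) -> int:
    """Number of PCR reactions needed to assay n_genes in one cDNA sample."""
    return math.ceil(n_genes / genes_per_reaction)


# Hypothetical round number: 26 sets x 10 genes per set on average.
n_genes = 260

singleplex = reactions_needed(n_genes, 1)   # one gene per reaction (real-time qRT-PCR)
multiplex = reactions_needed(n_genes, 10)   # about 10 genes per reaction (SM RT-PCR)
fold_reduction = singleplex / multiplex

print(singleplex, multiplex, fold_reduction)
```

Under these assumptions, each cDNA sample needs 26 multiplex reactions instead of 260 individual ones.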

FIGURE LEGENDS


Figure  1.  SM  RT-PCR  results  of  breast  and  prostate  cells  and  tissues 

The  results  of  the  SM  RT-PCR  experiments  are  shown.  There  are  a  total  of  26  sets.  SM  RT-PCR 
was  performed  to  examine  gene  expression  changes  in  breast  and  prostate  cancer  cells  and 
tissues. The sources of cDNA are abbreviated as follows: a normal sample (NB) and a primary
tumor (TB) of breast tissue from one individual; a normal sample (NP) and a primary tumor (TP)
of prostate tissue from a second individual; a normal prostate tissue (NP) from a third individual;
a hyperplastic prostate tissue (HyP) from a fourth individual; primary cultures of normal
mammary (MP) and prostate (PP) epithelial cells; and the MCF-7 (MCF), MDA-MB-468 (468),
MDA-MB-231 (231), BT-20 (BT), T-47D (T47), PC3 (PC), DU 145 (DU), and LNCaP (LN) cancer
cell lines. The
locations  of  the  DNA  fragments  amplified  from  the  individual  genes  are  also  shown  at  the  left 
side  of  the  gel  pictures.  The  symbol  M  denotes  DNA  fragment  size  markers,  and  the  symbol  G 
shows  the  results  of  genomic  DNA  control. 

Figure  2.  Intensities  of  the  bands  amplified  by  SM  RT-PCR  and  the  intensities  of 
fluorescence  detected  after  DNA  microarray  hybridization  of  the  genes  on  the  p-arm  of 
chromosome  8 

Band intensities in the SM RT-PCR gels shown in Figure 1 were measured by densitometry and
used to prepare this table. To normalize the values, the average band intensities of the
individual gels were adjusted. Partial results of the SM RT-PCR experiments are shown in the
left column. The band intensities of the PCR-amplified fragments were aligned by chromosomal
location and are shown in gray scale, with white as the strongest and black as the weakest. Data
on fluorescence signal intensity were extracted for the genes on 8p from the DNA microarray
hybridization results, normalized, and aligned; these results are shown in the right column.
The gene names, cytobands, start and end positions of the genes, and primer sets are also shown.
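
The legend does not specify how the average band intensities of the gels were adjusted. One plausible reading, sketched below purely as an assumption, is to rescale each gel so that its mean band intensity equals the mean across all gels, and then to map intensities to a gray scale in which white is strongest. The function names and the sample readings are hypothetical.

```python
def normalize_gels(gels):
    """Rescale each gel so that its mean band intensity equals the mean of all gel means.

    Assumes every gel has at least one band and a positive mean intensity.
    """
    gel_means = {name: sum(bands) / len(bands) for name, bands in gels.items()}
    target = sum(gel_means.values()) / len(gel_means)
    return {
        name: [value * target / gel_means[name] for value in bands]
        for name, bands in gels.items()
    }


def to_gray(value, weakest, strongest):
    """Map an intensity to 0 (black, weakest) through 255 (white, strongest)."""
    if strongest == weakest:
        return 255
    return round(255 * (value - weakest) / (strongest - weakest))


# Hypothetical densitometry readings for two gels with different overall brightness.
gels = {"set_1": [100.0, 300.0], "set_2": [50.0, 150.0]}
normalized = normalize_gels(gels)
print(normalized)
```

After rescaling, bands from the two hypothetical gels become directly comparable even though one gel was imaged twice as brightly as the other.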

Figure  3.  Correlation  between  the  band  intensity  obtained  from  the  SM  RT-PCR  or 
fluorescence  intensity  obtained  from  DNA  microarray  hybridization  and  the  Ct  values 
obtained  from  the  real-time  qRT-PCR  experiments 

The log2 values of the band intensity and fluorescence intensity were calculated and plotted
along  the  Y-axis  with  black  diamonds  and  gray  squares,  respectively,  against  the  Ct  values  on 
the  X-axis.  The  ASAH1  gene  was  used  as  a  control,  because  this  gene  was  ubiquitously 
expressed  in  large  quantity  in  all  the  cells  and  tissues  that  were  examined  in  both  the  SM  RT- 
PCR  and  the  DNA  microarray  hybridization  experiments.  Negative  and  zero  values  obtained  by 
microarray  hybridization  experiments  were  assigned  the  value  of  0.1  for  these  graphs.  The 
portion  of  the  ASAH1  results  was  enlarged  and  is  also  shown  on  the  top  row. 
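
The transformation described in this legend can be sketched as follows. This is a minimal illustration, not the authors' analysis: the Ct values and signals are invented, and the Pearson correlation is used here only as one reasonable way to summarize the relationship. Because a higher Ct means less template, a negative correlation is expected.

```python
import math

FLOOR = 0.1  # value the legend assigns to zero and negative microarray signals


def log2_signal(value):
    """Return log2 of a signal, replacing non-positive values with FLOOR."""
    return math.log2(value if value > 0 else FLOOR)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: each 2-cycle increase in Ct matches a 4-fold drop in signal.
ct_values = [20.0, 22.0, 24.0, 26.0]
signals = [1600.0, 400.0, 100.0, 25.0]

r = pearson_r(ct_values, [log2_signal(s) for s in signals])
print(round(r, 6))
```

In this idealized example the log2 signal falls by one unit per cycle, which is what perfectly efficient amplification would give.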

Figure  4.  The  expression  of  the  selected  genes  in  matched  normal  and  cancer  breast  tissues 

Gene expression was determined for the selected genes by real-time qRT-PCR using cDNA
prepared from 12 matched normal and cancer breast tissues. Results are shown for the genes that
showed consistent and meaningful changes in gene expression in both the cell lines and clinical
specimens. To normalize the expression data, we used the expression of the ASAH1 gene as a
control. The subtracted Ct values (Ct of each gene minus Ct of ASAH1) of normal tissues are
plotted on the X-axis, and those of the corresponding tumor tissues from the same individuals
are plotted on the Y-axis. The line y=x is also shown. Dots above the line indicate
down-regulation in tumor, whereas dots below the line indicate up-regulation.
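
The reading of this plot can be sketched in code. The subtraction and the above/below rule follow the legend; the fold-change estimate uses the common 2^(-ddCt) approximation, which assumes perfectly efficient amplification and is not stated in the legend. All Ct values and function names below are hypothetical.

```python
def delta_ct(ct_gene, ct_control):
    """Subtracted Ct value: Ct of the gene minus Ct of the control (ASAH1)."""
    return ct_gene - ct_control


def classify(normal_dct, tumor_dct):
    """Position relative to the line y = x (x = normal, y = tumor)."""
    if tumor_dct > normal_dct:
        return "down-regulated in tumor"   # dot above the line
    if tumor_dct < normal_dct:
        return "up-regulated in tumor"     # dot below the line
    return "unchanged"


def fold_change(normal_dct, tumor_dct):
    """Approximate tumor/normal expression ratio, assuming 100% PCR efficiency."""
    return 2.0 ** -(tumor_dct - normal_dct)


# Hypothetical matched pair: the gene crosses threshold 3 cycles later in the tumor.
normal = delta_ct(ct_gene=25.0, ct_control=20.0)
tumor = delta_ct(ct_gene=28.0, ct_control=20.0)

print(classify(normal, tumor), fold_change(normal, tumor))
```

A larger subtracted Ct means that more cycles were needed relative to the control, so the transcript is less abundant; this is why dots above the line correspond to down-regulation.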

Table 1. Primers used in the study


Set  Abbreviated gene name†  Frag. size (bp)  Primer 1 sequence  Primer 2 sequence  Final concn (nM)

1  ARHGA  144  TGAGAAGCAAAGCACGCCGGGC  TCTGCCAGACCATGACGGTCG  94
1  SOX7  108  CATGGATCGCAATGAATTCGACCA  GGTGTCACCTGGGAGACCGGAAC  94
1  FBX25  95  CTCAGGACACCCCTGCACGGC  GCAGCCCTTAAAACTTGAAGAGGTCGAT  141
1  NP_078883  86  TCGACATCAGCTTGCCCGAGAA  CCAGTACGTCTGTCCATTGCACTCGTAG  469
1  NP_079043  81  TGCCAGAATGGTACACAAAATCTTTTGG  TCTGGTTACTGAAGGAATCCCGGATCT  94
1  CSMD1  77  CCAAAAGTTCAATACAATGGCTATGCTGG  TCATACATGGGGTTTTCAAACGATGC  188
1  MTMR9  72  AAGCAAAAGTCAATATCCTTCGAAGGCAGT  AGGGACTCTCCTGCATCCCGTC  94
1  DEF1  68  GGCCTGCTATTGCAGAATACCAGCG  CCTGGTAGATGCAGGTTCCATAGCGAC  75

2  TNKS1  120  CAAATGCTCTTCTGTAGAGTGACCCTTGG  CAGCCCATTGACGCTCGGTCTAC  94
2  NP_919260  109  CCCAAGGCCTACACCAACTCGG  AGTCGCAGGGCAGCGAGCTGT  469
2  MYOM2  101  CCGAAGTGATTTGGTTCAAGAACGACC  GCCTTTGATGGTCATGCTGACGTACT  75
2  PINX1  94  AGGCCCCTGCTGGGACCAGAGTT  GGGCTTCAGGGTGAAGTCCCG  94
2  DLGP2  88  TCCCTGGACCTGCCCGACAGAC  CGGAATTCTGCCGGAAGGACG  141
2  BLK  82  GAGCTGTACCGCGGCGTCATC  GCACCGACTGCAGGAACTCGA  469
2  D103A  78  ATTGCAGAGTCAGAGGCGGCCG  GCGTCGAGCACTTGCCGATCT  19
2  MCPH1  74  TGTGAACTAGTCCACCTGTGCGGAG  TAGGGCCCGATGACGATGCTG  141

3  NP_940866  155  AGAACACCCCAGGGATATACACCTCG  TGGCTCTGGGACTCCCGAGACT  54
3  C8orf54  119  TGTCTTGGGGTATCAGATTTACAGCGTAACA  CATTTTCCTGAAAATCTGCTGCAGATTTAAG  214
3  Q96LV3  103  GGGGATTAAGTGGAGCTTATGGACTGC  TTTCAAACCCCACCCAAAATTCACTC  107
3  CH014  94  GGTGGGAGGAGAGCCCGAGATC  CCCAGGTGAGACTTGCAAGTTCACG  161
3  ENST297485  87  TGGCCATCTGCAAGATATGCCG  GGACATTTCCTTGGAGCTGCTCGAG  64
3  NP_1027009  82  GAAGCTGGAGCACAAATGTCCCG  TTCCACTTCCCAAGCTAAGCCTCG  43
3  Q8NF75  77  CACAGGGTGTTATTCCCATCTCATCG  AAGTCAGCCTCAGGATGCGGGT  536

4  Q8IWN7-2  170  GCACCTCCTCCTCCCGGAAGAGT  CAACTGCTTCACAGGAAAGCGCA  75
4  XKR5  157  CCCGACACCATGGCCGACATT  CACCAACAGCTGCATGGTGACTTAGC  75
4  NP_778250  146  GGATGACGCCAAGGGCTCGAC  ACTCCGCCTCCTCCGGGCTAT  75
4  C8orf15‡  135  AGCAGGTAGCACTGGAGCCGATC  GCAGTTCCAGTGTCCCGTGGTC  75
4  CH012  82  CAGAAGTACAAGGTGAAGAATGCATACCGA  CCAAAGAAGCAAAAGTGCTAGCACCA  125
4  NP_689484  78  GCCCCTAATGGACCTGCATGGT  CAAAACCGACAGCTGGTATCGTGG  125

5  NP_1035121  172  GGAGCTGCTGCAGCGCCAGAT  CTTCGCTTTCTTTCCTTTGCTCGC  107
5  Q86YV5  152  ACCCCATCAAGCGTATCCGCA  CAGGGCCCGCTTCATGTCGAT  536
5  CH013  132  TGCTCAACGATGCCACCTACGA  TGATGTTCATCTTGGTCACGCCAATA  161
5  XKR6  120  TGACATGCCAAGAAAGCGATACCC  TCTCGATATCGAATGCCTACTGCGG  107
5  Q8NEP6‡  101  CAAATATGGATTTGGGGTGTGCGTAA  TGTCCTGCAGCCGCACCGATA  214
5  Q8N852  91  TGGGCTTCCTGGGACTCGGTG  GCTTGGAAAACCACAAAGAATGACGC  64
5  Q8TCU9‡  85  GAAGCCAGGGGAACAAGGTTAAAAGG  TGTCACCATCTTCAGCCAAGCCAC  107

6  MSRE  158  GAGCAACATGGAGAAGAGAATCCAGC  TCCCATGTCCCTGGACTGAGGAAA  102
6  XR_017857  145  GCCCTGTCTGCCTTTCTGAAACAA  GGTCATCTGAAAGCCTGGGAAAGT  102
6  649548  124  AGCCACAAGTGCTGCTGATGTGTA  GGATTGTTTACCCTGATGGCCAGA  68
6  FDFT  112  CCGGAGAATATTGACTTGGCCGT  ACACACTCTGGTTTCTGAGTCTCG  68
6  730602  103  GAATATTGGATGGATCCTGAAGGCGA  CAATGACAGAAATCTGTTCCTTCAGCTGGC  68
6  Q96LV6  90  TGTTTGTGCCTAGCACGATTGGG  AGATGGTCCTGGGTCTCTGAATAC  102
6  NEIL2  84  AGAAGTTCCATCGAGGACAAGCCT  AGAAGTATCTCTGGTCCAGCAGTGT  102
6  Q96KT8  78  AAGCTTGTCTTTGCCTTCACGCC  CCCGGACCTCTTCTGATAAGGAAT  205
6  SGCZ  74  ACAGTGTATGAACTCTGCGTCTGC  TGACAAGTGGAACCTACTCCTGCT  102
6  NP_1034551  71  GCCTGTGCTTCCTTCAGAGACTCA  GGAGTGAAATGAAATAGTGCGGTCC  205
6  TUSC3  68  TCAGACCACCCAACTACTCTGGTA  AAGCAAACCTCCAACAAGCGACAC  102

7  NP_1027009  138  GCCACTTTACACTGTTGCTCCCAT  AAATGCCACTGCCACTGCTATCTG  136
7  GATA4  125  TCTCAGAAGGCAGAGAGTGTGTCA  CCGGTTGATGCCGTTCATCTTGT  102
7  BLK  114  GGATGGTCTATGCCAGAGGCTGA  AGTTTCCTGACCAGCCTGAGAGA  102
7  CH013  105  CACGTGGAAGAAGTGGGTGCAGAAG  GGTCTCCAGCCTCAGTTTCTCGGT  55
7  Q96LV6  97  TGTGGCTGTTGGACGCCTGTC  CAGAGAAAGACGCAGAGTGGAAGT  205
7  NP_065895  90  TCCACAGCCTTTATGCGCTACTAC  CAGGATACGGAGCTCTGACACATTC  136
7  CATB  78  AACACGTCACCGGAGAGATGATGG  AGTAGGGTGTGCCATTCTCCAC  68
7  NEIL2  74  CAGAAGGGCCGTTGGTGAGGAAAT  CCCTGTCTTGACCACCTGCTGA  68
7  CH014  71  CCCTGAAAGTAGCAGGACAGCCTTA  CCTGGACTCACAGACTAGACTCTTGC  68
7  BLK  68  ACAAGCATTTCGTGGTGGCTCTGT  CTTCAGCATCTGCAGGTCCCGATCATT  68
7  CH012  65  AGTGGAGGCCGGTAGTGCTGAAT  TGAGAGAGGCAGGCGGAAGTCTTT  68

8  Q86YV5  100  GGAAGAGGACCATCGGACGATCTA  TTCACCTACCTCGGAGTTCTGGCT  188
8  NP_004216  91  CAGTGTCCAGATCAACAGCCATGT  ACTCACAACCACAGGAACTTTCCCTC  75
8  THEX1  83  GCCTCACTGTGGTCTTGATGACTCTA  TGATTCGGAGTTCACACCCATCCT  125
8  NP_078883  70  TGCGCCAAGAGAGGTTTGCCTTTA  AGCTGAATACAAGGCCTCAGTGGT  75
8  MSRA  65  ACATCCGGGAGGGACAGACTTTCTA  GTTCTTGCTCAGGTACTGCTGGT  75
8  NP_919260  61  TGATGACGCTGGGCATGGTGTT  AGGGTGCCGGTCAGGTTGA  313

9  PNMA2  178  TCATGCACATAGTGCAGGCAGACA  AGCAGGGTTTCTAGCCGTAACAC  167
9  LOXL2  163  ACTCCTGATATGCCGTACTTGGACC  GGGAAGTCTGGGTCTCGTTTGTTT  67
9  NP_060561  149  TGGTGGCTGGGAAACATTCTTGTC  TCCCATACACATGCAGCTCTCGTA  83
9  ADEC1  136  CCCAAAGGATTTCAGTACATCTTGCCG  TTCTAGAAGGTGGTTCCCACACAC  83
9  WRN  124  AATTGGCATGCACTTATCCCAAGCGG  TGAGTTGACGGGAGGGTTTCGGAT  167
9  FGL1  113  GTGTCACTCTGCAAACCTGAATGG  CAGATTTCAGAGAATACCACCACCCA  125
9  VATB2  103  ACATTGGCTGGCAGCTACTCCGAAT  TAATGCTTTGCAGAGTCTCGAGGG  83
9  D104A  86  TATGTGGTTATGGGACTGCCCGTT  GCATGCATAGGTGTTGGGACATCTTC  83
9  IKKB  73  CATGAATGCCTCTCGACTTAGCCA  AGGTAAGCTGTTGGAGGCCGT  167

10  TEX15  195  GACCACAGTTGCAAGTACTGCCC  GCTTGTGGTAATGGCTGATGAGAAGC  41
10  MCPH1  179  GTTGTAGACAGGCTGGGAAAGAAGAC  GGAAGTGGTACCTTTGTTCTGTGTGC  68
10  2ABA  164  AGAAACACAAAGCGAGACATAACCCT  TGCCAGGCTGTGTGAAGGATTT  136
10  ADAM2  137  CATTTACCATTCCAAACCAATGAGATGGC  CGCTTGAATAGTCCTCAGTTCTCCAT  136
10  STMN4  125  AGCAGAGAAACGGGAACATGAGAGAG  CCTCCCTGTTCTCCTTGTTGGATT  102
10  TMM66  114  ACACCCTTCTCAGACTCGTGGTACT  TGAACATACCGAATAGCTGCCCGA  68
10  ADA28  104  TTTGCCCTGGAAAGGACGGATAGT  CACACTTAGTTCCATTGGCCACCA  102
10  UBXD6  95  TCTACTTCCTTTCCCAGACGGCCT  TCCTCCAGGATGAGTACAGTGTCCA  102
10  PGFRL  87  TCCCAGTGGCCCTCCCTCAACAA  AGGACAGTGCAGAGCACACTGAT  102
10  ASAH1  80  AGCCTACTTTATCCTGGGAGGCAA  ATACATCCAATGATTCCTTTCTGTCTCGTG  170
10  NM_199205  69  TGATGCCAACTGGTGTGGCTACTT  CGGATCCCAAGGGTAAGGAGGAT  205

11  LZTS1  182  AGGAGAAGGAGAAGGTGATTCAGTACC  AGTGGCTATGATGTCCTCGTAGGG  125
11  Q8N1G8  167  TACGGCCACCTTGTATCTTGTCAC  GGAGCACAGAGAGCTAGCATCAGG  63
11  ARHGA  153  GACCTGTCCTCCTCATCTGGGTCC  CGAGCTGGCCTTGGCTTTCTTGGC  63
11  LIPL  140  GTCTTACACACATTCACCAGAGGGTC  TCTGCAATCACGCGGATAGCTTCT  250
11  Q8N8I7‡  128  GGAGGAGATGTTGTGTGCATGTGAGA  TCTGTCTGCGTGAGCACATTCGCATA  125
11  BMP1  117  AGGTGTACTCGGCGGGAGATTCTGT  TGCTGTGGAGTGTGTCCTGGAACTT  63
11  Q16016  107  AGGCCTAATCGCCACTCATCAGCAAA  TCGGTTTGTTCAGAGGTAATGGAGGG  94
11  ENST356349‡  98  AAGAGACTAGTGGCATGTTGTCCC  TGCTTAAAGATTTCTTCCAACAGATAGCCA  313
11  Q9UDD8  90  TAGAAGGCATGCTCCTGACTGTGA  GGTGAGGATTCACGATGATTGGGT  63
11  ENST357748‡  77  GAAGCTCACTGGTGTGTGTTCCTCA  AAACATGGTGGCCAGCAGCATTTG  93
11  ENST338711  72  AAAGCATGGCCATCATCTCCCACA  ATTCTGCAATGTAGAGGGCGGCA  125
11  CLN8  68  GTCAGCAGCCTGTATCTGCCTCATTT  ATGATTAGCGTAAGCAGAGCCAGTCC  188

12  NP_079091  216  GCTGCTGCAGGAGTGTCTGA  TGGCTTTGCAGGTCTTCCTCCAT  156
12  PP2BC  168  AGGGTTCTCGCTTCAGCACAAGAT  TTCTTCCCTTGGTCGCTCCTGT  156
12  Q86YR2  141  AGTCCCTATGCAAGTAGCCCAGTT  GGAGGGAGGAATGATGATAAACCAGG  94
12  EFA6R  129  CAGCATTCTCAAGGAAGGAGGCAA  CGCTTGACTTTGGCAGTGATTGGA  94
12  Q7L3Y3  118  CCCTGCAGAGAGTGGTGATAACACAT  GAAGTGGCATTTGAAACCCAGGCT  94
12  GFRA2  99  TCATCCCAGGGAGTAACAAGGTGA  GTTTCAGCATCAGGACAGACAGCA  250
12  MTMR7  91  CCGACAGTCAGTTACAGATTACCTAATGGC  ATGTCTTACTTCTTCCAGGGCCTC  219
12  Q8NB85  84  TGTCCTGTGAAACCCTGTTACCTC  AGTAGGAGGTGACTCCTTATGGGCA  188
12  SH24A  78  GAGGAACCCATCACTTCCCTGG  CAGGTAGTCAGGCAGCTGGT  250
12  ZDHC2  73  TGCAAAGCCATTGAGAGAGTCCCA  GTTTATGCTGCTCTCCGTCCAAGA  188
12  SPG11  65  AGGCTCTCGGAGAACTCAGGGAAA  GCGTAGCAGCTGAAACCCGTTTGT  156
12  ENST360191‡  61  AGGAAGGCTCTCCTCCACATGTTT  TTCCATTCGGCAGCGGCTTCTCTG  94

13  FGF17  217  TGTTCACGGAGATCGTGCTGGAGAA  GAGCCCACAAACTCGAACTGCTT  115
13  Q9NUU8‡  200  CACTGCCCACCGCCCACCTCA  GCTGCAGCACAAGATCCTGGAGACT  58
13  SFRP1  169  CACCAGCTGGACAACCTCAGCCAC  ACTTAAACACGGACTGAAAGGTGGG  58
13  PIWL2  142  CTGTGCCACATGTACTGGAATTGGCCT  ACAGGAAGAACAGGTTCTCGCACA  58
13  DOK2  130  ATATACGATGAGCCCGAGGGAGT  CTGGCTGGACATGCTGGAGG  87
13  CGAT1  119  ATGAGAAGCGCTGCATGGACGA  TATCTCGTGCCTGAACACCAGCAT  87
13  Q8NEE2‡  109  ACAAAGACACCCGGAACCACCACA  AGGGACCATATCCGGGAGAGC  288
13  CTR2  100  ACAATGAAGAAGATGCTTATCCAGACAACG  GGTGAACTGAGATTTCTTGGGTGATGG  115
13  Q71JB5  79  TGCTGGAGGAGTTCTGCAAGGA  GAGACGTTCTGGCGAGGATCATGT  87
13  VMAT1  74  TATGCAACCCAGAAGCCCACGAA  TACTCCTCATGGTCAGGCTCCTCAT  87
13  DEMA  70  TCTCAAGGGTATTTGCCATGTCCC  TTGAGCTCATTCCGCTTCCACAGA  115
13  ENST319916‡  66  TTCCTGTCCATGTGCCTGGTCACT  CTCAGGGCTGACAGACAGGAGGAT  115
13  Q8WUT8  62  CCGACATGGTGGATAAGAACAAGTGC  CACCGCTGAAGTGAATGACATGTTGC  115

14  PHYIP  201  TGCATGTACACGGCCTACCACTA  GTCGACGGGCTCAGTGTAGATGAT  80
14  GFRA2  170  AACGCCATCCAGGCCTTTGGCAAC  TGGACAGACGTGCAGGTGGTGATGA  107
14  Q9P1G9  156  CAAGCCAGGGTTGGAAGAACCAAA  TGAGTCAATGGCCTTCACCTCCAT  80
14  LGI3  143  GGGACGACAGAAGTTTGTACGGTT  CCACCACAATGTGTCTATACACCAGC  80
14  PDLI2  131  CAGGCTGTGCGCATCCAGGAG  CGGGCATGCTTCTCACAGTACA  80
14  INT10  120  CACCACACTGTAACTCGAGGCAT  AATGCAGAACCTGTGCAGAACCAC  107
14  BIN3  110  ACCTGTCCCATCAGCTTGACCA  TCAGTCATCGGCCACAATGGAGA  107
14  Q96BB3  93  GGAAGGGCAGCATCTTGATTCCAT  GTCCTGAATACATGGTGAGGACCAC  107
14  XPO7  86  ACAGTATTGTGAACAGCCAGCCAC  ATTTCGCTCGATGCCTTCCATCAG  107
14  PSPC  80  TGGAATGCTCTCTGCAGGCCAA  TGCTGAGCCTGCATCTCGCC  134
14  NP_1013864  75  ACCTCCCTGAAAGGCCTCAGT  CTCGTTGCTTCTTAACCACTAATGATGGCA  268
14  RP03D  71  CTGGGACACGTGAAGCACAAACTT  TACCGGTGTTTGTGATCCAAGAGGGA  107
14  BD02  67  ACCTGCCTTAAGAGTGGAGCCATA  TGCCAATTTGTTTATACCTTCTAGGGCA  80
14  ENST357748‡  62  AGCCTACACCATAATCTCAGGATCCACG  TGAGGAACACACACCAGTGAGCTT  161

15  NKX31  193  CTCACGGAGACCCAAGTGAAGATA  CCACGCAGTACAGGTATGGGTAGTAA  125
15  TR10B  162  TAAAGGTGGCTAAAGCTGAGGCAG  AGTGGTCCTCAATCTTCTGCTTGG  63
15  STC1  148  CAGACAGACCACTGTGCCCAAACA  CACTCTCATGGGATGTGCGTTTGA  94
15  CHMP7  135  GACATCCTCCTTCAGGATACCACCAA  GGACAGTTTCTCAAGTTCAGCTTC  94
15  Q9Y3T6  112  AACTCCTGCGTCTGGTGAAGG  CGGACAGCAGGCCGCTCTTTCTTT  94
15  NFL  102  AAGAAGAAGGAGGTGAAGGTGAAG  TCTTCTTAGCTGCTTGTTCCTCCC  156
15  TR10D  93  CACTGGAAGAAGGACATGCAAAGGA  TAGCAGAGCCTGCCTCATCTTCTT  125
15  MFRN1  85  TCCAGTCCATCCACTTCATCACCT  ATGATGTGGGACTGCGGGTTGTAG  94
15  TR10A  78  CCTGCCTCTTCTCATTACCTCTCA  TCAAACAAACACAATCAGAAGCACA  94
15  ADAM7  72  CACCATCTTGGTTGTTGTGCTTGTCC  CGGTAACGAACTAATAGTATAAGAACTCCG  188
15  PEBPL  67  TTCCACCTGGGCGAACCTGAA  GGGTTGGTGAGTCCTGGTAGTTCT  125
15  LOXL2  63  GGTTCCTTCAGCGAAGAGACGGAAA  CTGGTTGTTTAAGAGCCCGCTGAAGT  125

16  SCAR3  191  TCAAAGGGCAGCTTTGGAACTGGA  ATCCCTGGTTCACCCTTAGGC  94
16  Q6P464  160  CCTGCTGCTGGTTCTTCCGAT  CAATCCTTTCCAAGGCAGTCAGAGAG  125
16  TRI35  146  CGCCACTGCCACCTGTACACCTTC  CAGCCATCCAGTTCTTCCTTGACA  125
16  ENST355177  133  TCATCAGAAAGTACCACAGACTGGGC  TGCAAGCCAAATAGAGAGGCCTCAGA  63
16  BNI3L  121  GAGTTCCACTTCAGACACCCTAAACG  GGAAGAGAGATGGAATGAACACCTTCAG  188
16  CLUS  110  AGCTCTTTGACTCTGATCCCATCAC  TTTGCGGTATTCCTGCAGCGCTTT  125
16  FAK2  91  AGATGCTGACGGCTTCACACA  TGGGCCAGATTGGCCAGAACCTT  125
16  ADA1A  83  AGGTCTGCTGCTGTGTAGGG  GGAGATGGTGTGGACCTTAATGGTTG  94
16  Q7Z2R7  76  GGTCCTCTGTGTAGGTGAATGTGTCC  CACCTCTAACAAGACAGTCACACTGAAA  94
16  HMBX1  70  AGCAATCCTGGAGAGTCATGGGAT  GTCGACATCATCACTGTTTGAGTGGC  94
16  KCTD9  65  GTGATCTGTCTGGGTGTGATCTTCA  TATAGCTCCCTTCACGTTGGACCCT  94
16  GON1  61  GTGCGTGGAAGGCTGCTCCA  TCTCTTTCCTCCAGGGCGCAGTCCATA  125

17  DUS4  176  TTAAGCAGCGCCGCAGCATCATC  AAAGCTGAAGACGAACTGCGAGGT  75
17  Q96T53  147  CAAGAAAGTGGAACCAAAGCACAGC  CCCAGCAAACGAAACCAAACACCT  113
17  NP_060720  122  TATTAAGGTGGAGGACACAGCCAAGG  GCACTCTGAGCATCTCGTCATTGT  150
17  TOPK  111  GGCCACCTATTAATATGGAAGAACTGGATG  CAATGTGTGCAGCAGAAGGACGAT  60
17  Q6ZP73  101  TCGGGAGTGAATACAAGGAGATGGAG  CCTTAGGAGGAATTCTTGCATGCCCT  150
17  DCTN6  92  TTGACAAGTGGCTGCATCATTGGG  AGGCAGTCTGCACCATAGATCACC  75
17  COE2  84  ACCGTTGGGTCTTCCAGCACAT  GGCAAAGGCACTCTTCTGTTTGAC  113
17  HYES  71  ATCCTGATTCCGGCCCTGATGGT  ATGTGCTGGGACATCTGAGGAAC  150
17  PNOC  66  AATACTTGGTCCTGAGCATGCAGTCC  ACACATTACCATTCTGGTGCAGGG  225
17  DPYL2  62  GCACCACCCAGCGTATCGTG  CTAGCCCAGGCTGGTGATGTT  300

18  NP_1015508  177  CTGAAGGTAAGTGAGGTGAGACCAC  GCATTCTTGTTCTTCACCACTGGC  144
18  NP_076930  148  ACTGGCCTACCTCATGCTGTACCA  TCATGCTTCCAGACCCTGCC  231
18  UNC5D  135  TATTTCGCTACACAAAGTAGCCCATCTGC  GTTTGAGAGTTTCGTGTGTGTCCTCC  144
18  EXTL3  123  GCATCAACTTCTTCGTGAAGGTGTACGG  AGATGAACTTGAAGCACTTGGTCTTGTC  144
18  WRN  112  TGCACTTATCCCAAGCGGTGAAAG  TTGACGGGAGGGTTTCGGATAACA  144
18  FAK2  102  AGTGAGGAGTGCAAGAGGCAGAT  ATTGGCCAGAACCTTGGCCT  231
18  LERL1  93  AGCTTGTGCACTTGTTCTCACAGG  CAGCTGAAGTCGTCATTGCTTCCA  115
18  NRG1  85  AACACAAGCTCCCAGAGCAGTAAC  TCTGTATGCCCAGGAAAGGCGTAT  173
18  PP2AB  78  TTGGTGTCATGATCGGAATGTGGT  ATAGCAGCCTGGTTCCCACAACGATA  202
18  Q7Z4A1  72  AAAGGAGAGAAAGGAGACAGAGCTGG  CAACACCATATCCCACGGCCC  173
18  Q7Z2R7  67  TGTAGGTGAATGTGTCCCAAACCTGC  CACCTCTAACAAGACAGTCACACTGAAAC  202
18  ZN395  63  TCTGCCATCCTTCCAGATCCCAGT  GCAGCAGCCCAGCTGACACT  144
18  ENST332498  60  TTATGAAGTGTTCCCAGTGCCACACT  TTGGCCCAATCTTGTGCTTTCCTC  144

19  NP_115613  197  TGGACTGCAGATGCTGAAATCTCTCC  TGACATAGTCCATGGCCAGCTCCA  80
19  Q8NB20  181  GCACGCCACGACACTTGACAATTT  CAACTCGCCAGTACCTTATCAAGGAG  134
19  ACHB3  166  ACGTTCACCACAGATCTTCTTCCAC  TCGAGGACTTTGCCTTTCACTACTGG  80
19  NP_859074  152  TTTAAACGTGCCGTCTATGTAGC  CGGAATCCTCTATGGAGTCTGTCT  268
19  ARY1  139  TCACCCTCACCCATAGGAGATTCA  TGTTTGGGCACAAGCTTTCTCTGC  107
19  NP_689628  127  CTTCCTCCATATCCTCCACAAGAAGC  AAAGGACACCAAATGAAGGAGCGG  107
19  HAIR  106  ACAGTCAGCGTCACTCAGCACTT  CATAAAGCAGGTGGCAGTCAGGG  161
19  REEP4  97  AGTGTTGGTCAGATACTGAGGCAG  TTCCTCTTGACCACACGCAGGCT  161
19  FGF20  89  TTCCGAATGCATCTTTAGGGAGCAG  TGCGGCCAGTGTCTCCATGTTTAT  107
19  NP_065800  82  GTGGACAACAACACAGCATTGGT  GCTTGTCCATCCGAGCTTTCAA  161
19  VINEX  76  CCAGAACGAAGACGAGCTGGAG  AAACCAGCCATCGTCACACTGCT  214
19  NPM2  71  GAGGAAGAGGAAGATGATGAGGATGAGG  ACTTGTTTGACAGGGCTTTGCTCC  161
19  RN122  67  TCTGGTGAAATGGCTGGAAGTTCG  TGAGGGACTAGCAATGGGCTTGTT  134
19  CNOT7  63  ACAGGAGGTGGCAGAACAGTTAGA  TCAGATCCTGCCTGATGTTGTGGT  214

20  EGR3  194  GGTGACGTGGAGGCCATGTATC  TGGTGGTAGAGGTTGTAGTCAGGA  144
20  KI13B  178  CTGACGTGCTGGTGCAGACGAT  ACGCTGTGGGAGAAGTACCCGCTA  115
20  Q8NH45  163  TGAATCTGGTCACGGTGCTGAGGAA  TGCGACTGCATATCCAGAATCACC  87
20  NP_116053  149  TACGAAACAAAGACCTCCCTCAGC  TGGCCTCCAGAACTTCTCAGTGAT  87
20  ACHA2  136  GTTTGCAGTGACCCACATGACCAA  CATCTTGCAGTTCTGCTGGTCGAA  87
20  RHBT2  124  GTGGTGTTTCCCTACACAAGCAAG  AGAGGCGGTTGGCTAGAATGATGA  87
20  ARY2  113  GGGTTTACTGTTTGGTGGGCTTCA  CAGCACTTCTTCAACCTCTTCCTC  87
20  ESCO2  103  CCTAGGGCTTGGCAATGTTCAGAT  GTCGTCTTGCAATGCGCTTTCTTC  87
20  DOCK5  86  AGCACAAAGGCCAAAGAGTCTCCA  TCAAGGGTGTTGACTGAGGAGGTGAA  87
20  T2EB  79  TTAGATCAGCATGACCAGCGAGGA  CTTTCTGGGAATTGGGCAGTGCTT  87
20  CCD25  73  AAAGACCAAAGTCGAGCGGTTCC  CTCATTCCTCTCTTCACGATCTCTGC  115
20  RBPMS  68  TACCCTCTGTACCCAGCGGAGTTA  AGTGAAGCGGGATAGGTGAAAGCA  346
20  RBM13  64  ATGGCGACATCTACAACTTCCC  TCTGCCTCCTGTTGTTCCAGGG  144

21  731173  208  AGGTGATGATGACCAAACGGACCA  ACTTTCTTGGCTTCCCATTCAGCC  107
21  137814  191  CTACAAATGCAAGAGACAGCGCCA  TGTAGCCTCCGTAGCAAGAGTAGG  268
21  NP_689557  175  AAGCCCAACCATGCAGACATCTTG  CACTGATTGCATAGGCCTGGCTCA  107
21  NFM  146  CCAGTGAAAGCAACTGCACCTGAA  TCACTAGAGCCTTCCTTCTCGGAT  214
21  NP_689626  133  ATGCTAGTGGTGGTGGCAGTCAA  GGTTGCAGAGGATGGAGGTGATGA  80
21  Q96M97  121  GCCACGAGGTGTTTGAGAACTACT  ATGTAGGCAATGAGCACCCAGA  107
21  NP_612445  110  GAATTATGAGGATATTCAGTGTGCCACC  GGAATACATTTCCAGTTGTTGTCTTAAACG  321
21  ZN703  100  GACAAGTCCAGCTTCAAGCCCTA  AGGACGAGGAGGAGGTGGAAGACA  321
21  TR10C  91  TCCTGCACCATGACCAGAGACACA  TGCTACACTTCCGGCACATCTCT  80
21  PROSC  83  ACCATAGCCATCGTGGAGCACATA  TGCCCAAAGCTTCCTATGGTCATC  107
21  NP_067644  76  TGGAAGTGGAGCAGAAGATGCTCA  GAGGAGGAATTTCTGCCAATGAGTG  134
21  NP_079391  70  TGTGGTCTTACTGAAGGCCCTCTT  CTCAGGTGTAAGGTTTGGATCCCT  107
21  SPFH2  65  AAAGCTTCTCATTGCCGCCCAGAA  TCTTCCGCTCTGTCTCTGCTTCCTTT  134
21  DB134  61  CTGCAGACTTGAATGCTATGAGAGTG  CTCCAGCTGAAACATACAGTAGGC  214

22  RFIP1  192  GCTCAAGCTTGGATAAACAGCTGCC  TGCTGGTGTGGTGAGTGTCAGAAT  68
22  PKHA2  176  TTTCAAGTGGGCCCAACTCTATCC  AGGCGTGAACAAGGAGTCCTCTGAA  205
22  NP_112216  161  GGATCTACCCACTTTACTGAGGCA  CTAGAATTCCAAACAAGGCTTCAGG  409
22  NP_060780  134  CAGTTGGTGGAGCTGGCAAATGA  AATCGGGCAAGGGAACATGAAAGC  136
22  NP_1019552  122  TGGTGTGGATCGATTCTGTTTGGG  GCATCAGCCCTCCAGTAATTAGCA  273
22  GP124  111  TATCCACGCTGCTCTGGATGGG  TAGGACTGGGAGTAGGCAGAGC  205
22  FGFR1  101  TACATGATGATGCGGGACTGCT  TTGGAGGTCAAGGCCACGATGC  205
22  TACC1  92  GCCAATGAAGAGATTGCTCAGGTTCG  TCCACCTTCATCTGCTCTTTGCG  205
22  NP_056029  84  CAGAAGAGACCTCTGTGGCAGTTA  CATAGTCAATGCGTTGGCCTCCAT  341
22  ASH2L  71  TGGTGTCAATCAAGGTGTGGCT  TGTACAGTGAGATGGCTGGGAAGT  341
22  LSM1  66  GCTGGAAGCAGAGAAGTTGAAAGTGC  TCTGCTCGAGGAATGGAAAGACCT  273

23  Q96ED6  198  TGCCCTCTGCCCTGAACTCTCAT  GTAAGCCGGGTGGAAAGATGATGT  87
23  THAP1  167  ATCTTCTGGAGCCACAGGAACAGCTT  TTTCCGCTGGTGCATTGTATCCTC  115
23  HTRA4  153  CTCTGGATTGAGAGATCACGATGTA  GTTTCAGGTATGACTGTCAGGAGC  346
23  NP_075447  140  CAGTGCAGCTGTTTCCTTCTGTGA  ATTCTGGTGACACAGGAGCCAT  144
23  BAG4  128  AATCAGGACCGACTGTACGACCACAA  TTCAGTCATGTAGAGATTGCCGGG  144
23  NP_115872  117  CCATGATTGGAATGACATTTGCCT  GCTTCTGTGCAGTGGAAAGTACAAG  288
23  ADA32  107  ACACCTGGCTTCTAGGTTTCCTCA  TTCCTCTTCCTTGGCGAACCACTT  144
23  CH004  90  ACAAGAAAGCAGAGGAGAGAGCCA  TCTTCTTCAAGGCCATCAGGGCA  173
23  NP_115712  83  GATTGACCTGGAGAAGGGCTCACA  TTAGCTGGACAGCTCCAGATGCAA  144
23  Q6ZQW0  77  TGAGCTGCCAGTTCCTGAAGGGT  ATAACCCATGGTGAGGAAGCTCAG  144
23  NP_078921  72  TCAGACAGATACTGTGGGCTCTGTG  GCCATCATAATGTTGCTGGGCCAT  115
23  ADAM9  68  CCACAACCGAAAGTATCATCTCAGGG  TATAAAGGAGGTGCAGGAGCAGGA  231
23  GOGA7  64  GAAGAAAGTCTCCAAATACATTCAAGAGC  GAGGAGGCCTTGTGGAGCATAGAT  288

24  ACHA6  199  TGATGAGGTGGCCTCTGGACAA  ACCACCCACTGTAATGGCTGATGA  250
24  NP_006740  183  CATGACCCATGGCTCTGTGAAATC  TAGTTGCTTTCCTGGGCTGGCTT  375
24  1230  168  AGTCAAATCCCTCAGTCCGTGAGT  TCTTATTCTCCTTTGGCTGCTGGC  156
24  MYST3  154  TTAGCTGGGACTCCTCAAGCACAA  CTGGCATTTGCCCTTGCAATCTCT  156
24  HOOK3  108  AGACCTTGGTGATTTAAGGCGGCA  GCTGCGTTGGCCTTTCTTAACTCT  94
24  ANK1  99  TGCAGACGGCTCGATTGTCTCATA  GTACTTGTTCCCTGGAATGAGTGTGG  125
24  PFTA  91  CGTGGTCTTTCCAAATATCCTAATCTG  CCACAAGAAAGGCAATTAGGTAGGG  313
24  VDAC3  84  ATTGACTTTATCAGCTTTAATCGATGGG  TAAGCTTCCAGTTCAAATCCCAAGCC  375
24  DPOLB  78  CTGTTACATCAGGTTGTGGAGCAGT  GAACTTTGTCTCACCCTTTGACAG  313
24  AP3M2  73  GAGTCTTCAGGCTGGAGCTTCCAAA  CAGCTGCTGGATCTTAAACTGC  188
24  TPA  69  GCATGACTTTGGTGGGCATCATCA  TGTACACACCCGGGACATCCTTCT  156
24  ENST328274  65  GTCAATGTGATCCTGGTTATGCTCCTCC  ACTTCCTCCTGGTGATGACATTGC  313

25  Q6ZN93  179  TTCGCTCTGGGACTCACTCTGAAA  TGTGGTCGTTTGGAGTCCTCCTGAT  94
25  DKK4  150  TTTGTACTACGATGGAAGATGCAACCC  CCTTCCTGCCTTGTGATTTCTT  188
25  Q6XYA8  125  CAGTGAGAATTGGATAGCTCCATCAGGG  CCATAGAGCAACTCTGCGAGAGAA  281
25  NM_058200  114  AGACATGTGAACCACTCAGCCACT  TAAGAGGTCCCGTTTCACTGCGT  188
25  ANGP2  95  CACCTTAAAGGACTTACAGGGACAGC  CATTTGTCGTTGTCTCCATCCTTTGTGC  141
25  DEF5  87  AAGAGCTGATGAGGCTACAACCCA  GCAGAGAGTCCATTTCCTGCAAAG  188
25  D106A  80  AGGGACATGCAAGAACAATTGCGG  TGGTCCGACAGCATTTCAGAGACT  94
25  EDO1  74  AGCAGTGGAGGGCAATGTCTCTAT  TTCCCTCTGTAACAGGTGCCTTGA  281

26  ERIC1  195  AGAACAAGCAGAATTAGAGAAACAGCAGAG  CTGCTGTCCTCGGGCTGGTACAT  250
26  ADRB3  179  TGGTCCTGGTGTGGGTCGTGTC  AAGAGGAAGGTAGAAGGAGACGGA  167
26  PLCE  164  TTATGAGTCACCAGATCCAGAAAGA  TCCAGGTGTTCACATACAGCTTCC  417
26  ZN703  150  CTATGGCAAGAGCCACTTATCCAC  TGGAGGAAGAGCTGTAGTTACTGG  417
26  POTE8  95  ACTTGCTGTACGTTGTGGATCAGC  CAGCAGTTTGTCCAAATACATCTTGAG  417
26  NP_653253  87  TGAGTGTATCAGAACTACAGGCTGC  TGAGCTGTTGTCGCAGTTGTTCCT  167
26  PLCF  80  GGTGAAGGACACGTTCAAGGAGGA  AGCGGCTCCTGTCCTTGTGGTT  250
26  GSHR  65  ATGCAGGGACTTGGGTGTGATGAA  GTTGCTCCCATCTTCACTGCAACA  208
26  GOGA7  61  GAAGATCTATGCTCCACAAGGCCTCC  AACTCGCAGTCCTCGCTCAATA  292

† The gene names are abbreviated. Most names have "_HUMAN" at the end. The "00000" is
removed from the numbers of the ENST names, and "00" is omitted from the numbers of the NP_
genes with 7 digits.

‡ These genes have been delisted from the Ensembl Database in Version 43.
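
The abbreviation rules in the first footnote can be reversed mechanically. The sketch below is a hedged illustration: the footnote says only that "most" names end in "_HUMAN", so the rule used here for deciding which names receive that suffix (everything except ENST, NP_, NM_, XR_, purely numeric, and C8orf names) is an assumption, as is the function name.

```python
import re


def expand_name(abbrev: str) -> str:
    """Expand an abbreviated gene name from Table 1 according to the footnote rules."""
    # ENST names: the "00000" removed after the prefix is restored.
    match = re.fullmatch(r"ENST(\d{6})", abbrev)
    if match:
        return "ENST00000" + match.group(1)
    # NP_ names with 7 digits: the omitted "00" is restored.
    match = re.fullmatch(r"NP_(\d{7})", abbrev)
    if match:
        return "NP_00" + match.group(1)
    # Assumption: other accession-style names are already complete.
    if re.fullmatch(r"(NP|NM|XR)_\d+|\d+|C8orf\d+", abbrev):
        return abbrev
    # Assumption: all remaining names take the "_HUMAN" suffix.
    return abbrev + "_HUMAN"


for name in ["ENST297485", "NP_1034551", "NP_859074", "SOX7", "Q7Z2R7"]:
    print(name, "->", expand_name(name))
```

For example, the table entry NP_1034551 expands to NP_001034551, the form used in the text.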